<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Bryan's Reflective Path]]></title><description><![CDATA[Bryan's Reflective Path explores the intersection of technology and life through a lens of thoughtful analysis and personal insight. Join Bryan on his journey as he navigates complex issues and shares valuable reflections.]]></description><link>https://www.bryantsai.com</link><image><url>https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png</url><title>Bryan&apos;s Reflective Path</title><link>https://www.bryantsai.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 19 May 2026 04:03:31 GMT</lastBuildDate><atom:link href="https://www.bryantsai.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Bryan Tsai]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[bryantsai@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[bryantsai@substack.com]]></itunes:email><itunes:name><![CDATA[Bryan Tsai]]></itunes:name></itunes:owner><itunes:author><![CDATA[Bryan Tsai]]></itunes:author><googleplay:owner><![CDATA[bryantsai@substack.com]]></googleplay:owner><googleplay:email><![CDATA[bryantsai@substack.com]]></googleplay:email><googleplay:author><![CDATA[Bryan Tsai]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Semantic Layer with Unified Star Schema]]></title><description><![CDATA[A novel idea completes the generated star 
schema.]]></description><link>https://www.bryantsai.com/p/semantic-layer-with-unified-star-schema</link><guid isPermaLink="false">https://www.bryantsai.com/p/semantic-layer-with-unified-star-schema</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Mon, 12 Aug 2024 10:02:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!scCI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!scCI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!scCI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 424w, https://substackcdn.com/image/fetch/$s_!scCI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 848w, https://substackcdn.com/image/fetch/$s_!scCI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 1272w, https://substackcdn.com/image/fetch/$s_!scCI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!scCI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png" width="1023" height="535" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:535,&quot;width&quot;:1023,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1125719,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!scCI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 424w, https://substackcdn.com/image/fetch/$s_!scCI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 848w, https://substackcdn.com/image/fetch/$s_!scCI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 1272w, https://substackcdn.com/image/fetch/$s_!scCI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5ea66c-eedd-4783-b020-0f80255f76a6_1023x535.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I previously discussed <a href="https://www.bryantsai.com/p/generating-star-schemas-from-data-vault">Generating Star Schemas from Data Vault</a>, which serves as the foundation of our semantic layer. With a data vault containing all the entities and their relationships, we can generate dimension tables directly from entities (hubs/satellites) and fact tables from events (transactional links).</p><p>However, one manual step is still required. Most events only contain references to the entities they directly belong to. For analysis purposes, indirectly referenced entities often need to be identified and added to the generated fact tables. This is usually handled by ETL or ELT processes. 
Our semantic layer lets users navigate the entity relationship graph and specify the paths to the entities to be included in the generated fact tables. The system then creates the corresponding ELT jobs. This works well, but it adds cognitive load for users and creates opportunities for errors.</p><h2>Generating Fact Tables with Indirect References</h2><p>First, let's illustrate the approach with a simplified real-world example. Imagine we want to measure the power usage of manufacturing equipment in a factory. We can use smart electricity meters to collect the measurements. The IoT payload uploaded by these meters, which contains the meter's ID and power consumption reading, serves as the event data we will use to create our star schema.</p><p>Some business systems manage the equipment's asset records (such as the association between electricity meters and equipment, the location and organization of the equipment, and the operator of the equipment). 
We have replicated them from business systems to build a data vault containing all the entities and their relationships necessary for the analysis.</p><p>In the diagram below, the green paths starting from the event "evt_meter" in the graph must be specified to bring equipment, organization, and equipment type to the generated fact table.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e-Ht!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e-Ht!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 424w, https://substackcdn.com/image/fetch/$s_!e-Ht!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 848w, https://substackcdn.com/image/fetch/$s_!e-Ht!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 1272w, https://substackcdn.com/image/fetch/$s_!e-Ht!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e-Ht!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png" width="1456" height="857" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:857,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:106894,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e-Ht!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 424w, https://substackcdn.com/image/fetch/$s_!e-Ht!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 848w, https://substackcdn.com/image/fetch/$s_!e-Ht!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 1272w, https://substackcdn.com/image/fetch/$s_!e-Ht!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb76dc935-a3df-4611-bcfa-4cd2eef33305_1578x929.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>This approach supports incremental data loading and can be triggered by the incoming batches of meter events written into "evt_meter." The joins along these green paths account for most of the performance overhead, which remains manageable. The relationship graph does impose some limitations but also offers several options to eliminate loops and duplicates.</p><h2>What is the Unified Star Schema?</h2><p>The primary intended use of the Unified Star Schema is to enable self-service BI. As its name suggests, the main idea is to use a single "unified" star schema to serve queries. So, instead of having many fact tables, each surrounded by some dimensions, a single Puppini bridge table sits in the middle, surrounded by all dimensions and facts.</p><p>The Puppini bridge table has all primary key columns from the dimension and fact tables. As for rows, it is the "union" of the rows of all dimension and fact tables. 
Naturally, it is sparse and filled with many null values.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3_V0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3_V0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 424w, https://substackcdn.com/image/fetch/$s_!3_V0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 848w, https://substackcdn.com/image/fetch/$s_!3_V0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 1272w, https://substackcdn.com/image/fetch/$s_!3_V0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3_V0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png" width="1456" height="882" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:882,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:146932,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3_V0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 424w, https://substackcdn.com/image/fetch/$s_!3_V0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 848w, https://substackcdn.com/image/fetch/$s_!3_V0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 1272w, https://substackcdn.com/image/fetch/$s_!3_V0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffea60e2b-9e26-438f-8d9c-be820caf1037_1464x887.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>One tweak is that for rows coming from fact tables, any "indirect" foreign keys also need to be filled appropriately. The dashed areas in the diagram above mark the "indirect" foreign keys filled in for the original fact rows.</p><p>This method addresses several common problems in dimensional modeling.</p><p>First, fan and chasm traps<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>. The Unified Star Schema natively prevents both kinds of traps. No data can ever be duplicated because the bridge table is built with a union instead of joins.</p><p>Second, ease of use. Business users usually do not know every schema in detail, so it is challenging to get queries right, especially when multiple fact tables are involved (drill-across). 
Queries always start from the Puppini bridge table, followed by one-level joins to any needed dimensions and facts. A single fixed query pattern makes querying considerably easier for business users: the query is not only more straightforward to write but also free of traps.</p><p>Third, data-loss problems. Because all dimension keys are also included as separate rows in the bridge table, aggregation does not drop dimension members, even those without transaction or event data within the queried range.</p><h2>Finding and Saving Indirect References Automatically</h2><p>You may have noticed the operation common to the previous two sections: finding and saving indirect references. The Puppini Bridge table got me thinking about automating this operation.</p><p>Initially, we let users manually specify the green paths in the first diagram. This is not trivial, though. With dozens of entities and all the relationships among them, users often felt lost in the graph. Most of the time, it came down to trial and error.</p><p>Thanks to our data vault approach, all the information needed to find related entities is available. So, we later automated this manual operation. Instead of asking users to choose which entities to include, the system includes all entities reachable via any number of links by default. Every event table generates a fact table with all usable dimensions filled in.</p><p>Our system supports all relationship cardinalities: one-to-one, many-to-one, one-to-many, and many-to-many. Traversing a relationship from left to right, one-to-one and many-to-one relationships work fine since one row on the left matches at most one row on the right. No row "expansion" is possible, i.e., one event row can never become more than one row in the generated fact table.</p><p>However, one-to-many and many-to-many relationships may expand rows. 
For example, a power usage measure of 10 could be duplicated across several rows, so that it sums to more than 10 (which is, of course, wrong).</p><p>For these two relationship cardinalities, we provide an option to specify an "allocation"<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> rule for each measure. You specify a percentage for each row on the "n" side to determine how the measure's amount is distributed. Let's see a few use cases:</p><ul><li><p>An electricity meter is connected to and used by multiple machines. The power usage reported by the meter must be "split" among the connected machines. Otherwise, it would be double-counted. In this case, the percentages across the "n" side should sum to 100%.</p></li><li><p>A sensor measures the indoor temperature. The reading applies to all equipment placed in the same room. In this case, 100% is set for each row on the "n" side.</p></li></ul><p>This option is flexible enough to cover both splitting a measure and applying it in full. When the fact table is generated, the measure is multiplied by the allocation percentage of each target row. 
Let's see a many-to-many example:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mlQw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mlQw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 424w, https://substackcdn.com/image/fetch/$s_!mlQw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 848w, https://substackcdn.com/image/fetch/$s_!mlQw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 1272w, https://substackcdn.com/image/fetch/$s_!mlQw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mlQw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png" width="994" height="375" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb1bf864-0106-4757-b55d-e82f858852a3_994x375.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:375,&quot;width&quot;:994,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40588,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mlQw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 424w, https://substackcdn.com/image/fetch/$s_!mlQw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 848w, https://substackcdn.com/image/fetch/$s_!mlQw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 1272w, https://substackcdn.com/image/fetch/$s_!mlQw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb1bf864-0106-4757-b55d-e82f858852a3_994x375.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In the diagram above, meter1's power usage reading is evenly split between equip1 and equip2. This is the case of a single meter shared by multiple pieces of equipment. In another interesting use case, equip1 uses two meters (potentially from two power sources), so its power usage is the sum of its allocated shares from meter1 and meter2 (20 + 5 = 25).</p><h2>Semantic Layer with Unified Star Schema</h2><p>Our automated fact table generation is similar to the generation of the Puppini Bridge table. Both find and save indirect foreign keys in fact tables. The difference is that our approach keeps fact tables independent and does not mix them physically.</p><p>When I first learned about the Unified Star Schema, my immediate concern was the join between the (very large) bridge table and the (large) fact table(s). The bridge table only contains primary keys, so it must be joined to the fact table(s) to reach the measures. Proper partitioning and tuning of the bridge table seem challenging. 
One approach is to include measures in the bridge table as well, effectively duplicating all fact tables.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> Another approach is to have multiple physical stage tables, one for each fact table, and then create a view to union all the stage tables as the final bridge.</p><p>That is where a semantic layer shines. Implementation details do not matter to users. Internally, we use multiple physical fact tables. The semantic layer knows how to generate correct multi-pass SQL against those tables when doing drill-across queries. No fan or chasm trap exists, and there is no data loss either because the semantic layer knows how to avoid the traps.</p><h2>Conclusion</h2><p>The Unified Star Schema is quite interesting. Its novel idea of creating a Puppini Bridge table eliminates some of the most common problems in dimensional modeling and enables a more effortless self-service BI experience.</p><p>Inspired by the Puppini Bridge table idea, our star schema generation is fully automated and has complete dimensional references. This shows the beauty of the data vault.</p><p>A semantic layer provides the best of both worlds. To users, it shields the internal details and helps avoid the most commonly encountered problems when they directly write queries against tables. 
Implementation-wise, it can use the most optimized schema without worrying whether its users can use it effectively.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>https://showmethedata.blog/sql-traps-unified-star-schema</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>From Kimball method's "allocated facts."</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>I feel that might be a viable solution, particularly if the original fact tables can be safely dropped.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><a 
href="https://www.youtube.com/watch?v=ZxjhplaQWRA">https://www.youtube.com/watch?v=ZxjhplaQWRA</a> mentioned a scenario of using Puppini Bridge with PowerBI, which seemed to have problems without measures included.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Optimize Data Lake Layout to Support Multiple Ordering]]></title><description><![CDATA[There is a way for a data lake to be physically clustered and sorted by multiple columns at the same time.]]></description><link>https://www.bryantsai.com/p/optimize-data-lake-layout-with-multiple-ordering</link><guid isPermaLink="false">https://www.bryantsai.com/p/optimize-data-lake-layout-with-multiple-ordering</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Mon, 29 Jan 2024 11:01:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KPo0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KPo0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KPo0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 424w, https://substackcdn.com/image/fetch/$s_!KPo0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 848w, 
https://substackcdn.com/image/fetch/$s_!KPo0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 1272w, https://substackcdn.com/image/fetch/$s_!KPo0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KPo0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png" width="1456" height="968" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:968,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3185969,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KPo0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 424w, https://substackcdn.com/image/fetch/$s_!KPo0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 848w, 
https://substackcdn.com/image/fetch/$s_!KPo0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 1272w, https://substackcdn.com/image/fetch/$s_!KPo0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcefecbd3-1a6f-469b-928a-de30138ed599_1744x1160.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Imagine a million excavators each sending one message every few seconds. 
That's 10 GB every few seconds and a few TB every hour. And we must never delete the data.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> So, we use Apache Hudi to store the data archives. It serves many batch data processing pipelines, each with its own data loading pattern. Some analyze certain kinds of excavators. Some analyze only excavators within specific cubic capacity (CC) ranges. Some analyze excavator activity within specific areas.</p><p>But they have one thing in common: they are not fast. Most pipelines take hours to run. They read large volumes of data and spend most of their time reading data files.</p><p>The problem is that most data pipelines read <em><strong>ALL</strong></em> the data first and only then filter out the parts they need. Keeping duplicated data sets with different partitioning schemes and clustering layouts is impractical because of the size and the correspondingly huge cost.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>But we do have many frequently running data loading and querying jobs. They keep reading the same complete data set over and over again, while most need only part of it. Is there a better way?</p><h2>The Storage Model</h2><p>We partition our excavator data by day. The record key includes the excavator ID and the timestamp.</p><p>At first, we employed the bucket hashing clustering strategy<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>, with the excavator ID as the bucket field. This is suitable for retrieving the data of an individual excavator, as we only need to read one bucket (out of 128, our configured bucket number). But it is not so good for loading many excavators that match some criteria. 
By design, hashing distributes excavators evenly among the buckets, so the excavators matching any given criteria are scattered across them. We can therefore expect most large data pipelines to end up reading all the buckets.</p><p>The column stats index<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> does not help either. It can only help skip files that contain no match. When most files do contain some matching excavators, there is nothing data skipping can do.</p><p>Fortunately, Apache Hudi has a clustering feature that can help.</p><h2>Data Clustering</h2><p>Apache Hudi's clustering<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> is a data layout optimization technique. It stitches together smaller files and also orders the data by a sort key to improve data locality.</p><p>Let's say we have a table with 2 columns, A and B. We choose (A, B) as the clustering key, and clustering splits the data into files of 4 records each. Below is a simplified illustration. 
The records are sorted first by column A and then by column B.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mO61!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mO61!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 424w, https://substackcdn.com/image/fetch/$s_!mO61!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 848w, https://substackcdn.com/image/fetch/$s_!mO61!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 1272w, https://substackcdn.com/image/fetch/$s_!mO61!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mO61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png" width="837" height="183" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:183,&quot;width&quot;:837,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56886,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mO61!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 424w, https://substackcdn.com/image/fetch/$s_!mO61!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 848w, https://substackcdn.com/image/fetch/$s_!mO61!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 1272w, https://substackcdn.com/image/fetch/$s_!mO61!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9b417a5-3140-4057-83fe-cf858d69b9f5_837x183.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Clustering by composite sort key</figcaption></figure></div><p>When we query for "A == 2", we only need to read file 3. But, when we query for "B == 2", we have to read all 4 files. 
The column stats index confirms that all 4 files contain matching data, but because matches appear in every file, we still need to read all 4.</p><p>We can see that the first column, A, dominates the ordering. When we query only by the non-prefix column B, there is not much ordering to exploit (hence not much locality). The matching records "spread" out across all files.</p><p>So, what we need is a clustering method that can order the data by 2 or more columns while preserving locality for each column on its own.</p><h2>Space-Filling Curve</h2><p>A space-filling curve maps a multi-dimensional space onto a one-dimensional one. It can provide the "locality" property we desire.</p><p>The idea is that if matching data sits close together on the "curve," sorting by the curve's ordering puts that data in just a few clustered files.</p><p>Let's use the Z-order curve to get an intuition for how it works.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FkDp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FkDp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 424w, https://substackcdn.com/image/fetch/$s_!FkDp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 848w, 
https://substackcdn.com/image/fetch/$s_!FkDp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 1272w, https://substackcdn.com/image/fetch/$s_!FkDp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FkDp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png" width="896" height="433" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:433,&quot;width&quot;:896,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:166910,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FkDp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 424w, https://substackcdn.com/image/fetch/$s_!FkDp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 848w, 
https://substackcdn.com/image/fetch/$s_!FkDp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 1272w, https://substackcdn.com/image/fetch/$s_!FkDp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F983668b8-a1b9-45af-b48f-62c0e217b8df_896x433.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Z-order curve vs. 
normal sorting</figcaption></figure></div><p>The diagram above compares clustering with a Z-order curve against clustering with a normal ordering. Both use a split size of 4 records per file. On the left is the Z-order on (x, y), and on the right is the normal ordering by (x, y).</p><p>If we query for "y == 7," using Z-order only requires reading 4 files (green boxes) while the normal ordering requires reading 8 files. This shows that the Z-order curve keeps the number of data files read small even when a query filters only on a non-first column.</p><p>But when querying for "x == 2", normal ordering requires reading only 2 files (blue shaded boxes), while Z-order requires reading 4. We are trading some of the filtering power of the first column for better selectivity on the other columns.</p><p>Note that the ordering here is not a strict ordering but a rough one. That is not a problem. The main point of using the space-filling curve is the locality it brings to the table. Combined with the column stats index, it lets us locate and read just the files that contain matching data.</p><p>If most of your queries use only a single column (or the composite key's prefix), it's better to use the normal ordering method. But if the majority of queries filter on different sets of columns, then the space-filling curve method might be worth a shot. It works better with high-cardinality columns. The results are data-dependent, so it takes some experimentation to find the layout best suited to your situation.</p><p>In our case, we use the space-filling curve clustering<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a> on the vehicle type and ID, timestamp, latitude, and longitude columns. It serves our most frequently run queries pretty well.</p><h2>Conclusion</h2><p>For extremely large volumes of data, it might be impractical to have duplicated data sets. 
We can use space-filling curve clustering to efficiently serve multiple equally frequent query patterns. All major data lake table formats, including Iceberg, Hudi, and Delta Lake, support space-filling curve clustering. There is also the Hilbert curve, which can provide even better multi-dimensional data locality. In effect, a lake can have multiple orderings at the same time.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Never say never, but customers do get to say it.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Well, this assumption might not be true in all cases. 
If there is indeed some crazy job running dozens of times a day, it might be cheaper to dedicate a separate, duplicated data set (that is, caching).</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://cwiki.apache.org/confluence/display/HUDI/RFC+-+29%3A+Hash+Index">RFC - 29: Hash Index</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><a href="https://www.onehouse.ai/blog/hudis-column-stats-index-and-data-skipping-feature-help-speed-up-queries-by-an-orders-of-magnitude">Hudi&#8217;s Column Stats Index and Data Skipping feature help speed up queries by an orders of magnitude!</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p><a href="https://hudi.apache.org/docs/clustering">https://hudi.apache.org/docs/clustering</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p><a href="https://hudi.apache.org/blog/2021/12/29/hudi-zorder-and-hilbert-space-filling-curves/">Hudi Z-Order and Hilbert Space Filling Curves</a></p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Your Timezone or My Timezone?]]></title><description><![CDATA[We gotta choose one. 
Which one are you going to choose?]]></description><link>https://www.bryantsai.com/p/your-timezone-or-my-timezone</link><guid isPermaLink="false">https://www.bryantsai.com/p/your-timezone-or-my-timezone</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Sun, 21 Jan 2024 22:00:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PV2X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PV2X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PV2X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 424w, https://substackcdn.com/image/fetch/$s_!PV2X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 848w, https://substackcdn.com/image/fetch/$s_!PV2X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 1272w, https://substackcdn.com/image/fetch/$s_!PV2X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!PV2X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png" width="800" height="532" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:532,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:361588,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PV2X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 424w, https://substackcdn.com/image/fetch/$s_!PV2X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 848w, https://substackcdn.com/image/fetch/$s_!PV2X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 1272w, https://substackcdn.com/image/fetch/$s_!PV2X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cf3c5d1-69ba-4d94-b6cc-b22e84e626b7_800x532.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Customer: "I'd like to see the daily average utilization ratio of all equipment globally. And I also need to be able to drill down to individual continents, countries, and districts."</p><p>Billy: "Sure. Which timezone should be used?"</p><p>Customer: "Good question. Our headquarters is in Eastern Standard Time. Let's use EST, then."</p><p>Billy went ahead and added the required metrics shortly afterward.</p><p>Customer: "This is perfect. Oh, I forgot one thing. All offices should see this in their local timezones."</p><p>Billy: "Which timezones do we have to support?"</p><p>Customer: "I'm not sure, as we have quite a few locations. Can we support all the timezones? 
And I think people on travel also need to see this in the local timezone wherever they are."</p><p>Billy: "Okay. I think we can dynamically detect the local timezone, no problem."</p><p>Well, it turned out there were many significant problems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fnhw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fnhw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 424w, https://substackcdn.com/image/fetch/$s_!fnhw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 848w, https://substackcdn.com/image/fetch/$s_!fnhw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 1272w, https://substackcdn.com/image/fetch/$s_!fnhw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fnhw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png" width="1100" height="553" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:553,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:644662,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fnhw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 424w, https://substackcdn.com/image/fetch/$s_!fnhw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 848w, https://substackcdn.com/image/fetch/$s_!fnhw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 1272w, https://substackcdn.com/image/fetch/$s_!fnhw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5460f643-5294-4c05-a677-ef8bf8aa108a_1100x553.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Too Many Timezones</h2><p>First, the world has around 40 timezones, far more than the 24 that most people assume. Many countries and regions use offsets that are 30 or even 45 minutes off the hour. BTW, a country's timezone can change for various reasons, sometimes even adding a new timezone to the world.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>Billy created separate pipelines, tables, and columns for each timezone. The more timezones, the higher the runtime cost. It was not a viable solution.</p><p>Billy went back to the customer to try to reduce the number of timezones to be supported. In the end, he added timezone management features so that the customer could configure the set of supported timezones. 
A lot of complexity was introduced.</p><h2>Inconsistent Metrics</h2><p>In the customer's high-level operations meetings, executives found that the metrics on their screens occasionally differed from those on others' screens. This caused quite a bit of drama.</p><p>It turned out people's dashboards were based on their local timezones. The metrics were different here and there, which caused a lot of confusion.</p><p>The customer complained about the accuracy of the data. Soon, the system became untrustworthy.</p><h2>Proper Way to Handle Timezones</h2><p>To serve the purpose of effective communication, metrics must be consistent. Within a company, it is simply wrong to have personalized dashboards or reports for different timezones.</p><p>No matter where the viewer is around the world, the dashboard should always show the same values for the same metrics.</p><p>So, how do we handle timezone differences?</p><p>The answer is that we don't. For analytics and reporting purposes, where the viewer is shouldn't matter. What matters is the timezone of the target objects: in this case, the location of the equipment.</p><p>For equipment in EST, we use EST to calculate its metrics. For equipment in PST, we use PST. The equipment utilization ratio would read "85% on Jan 20th" for daily metrics and "83% for 9:00 AM on Jan 20th" for hourly metrics. When we mix and compare equipment from both locations, we only care about "Jan 20th" for daily and "9:00 AM on Jan 20th" for hourly. For business reporting purposes, we don't compare based on the absolute same time (like comparing 9:00 AM EST to 6:00 AM PST).</p><p>Technically, the implementation is quite simple:</p><ul><li><p>Store all timestamps in UTC, along with the original timezone information. For example, convert 
"2024-01-20T09:00:00-05:00" to "2024-01-20T14:00:00+00:00" and "EST", or "2024-01-20T09:00:00-08:00" to "2024-01-20T17:00:00+00:00" and "PST".</p></li><li><p>Generate date and hour dimensions based on the original timezone but without the timezone in the output. For example, "2024-01-20" and "09:00" are used for both.</p></li><li><p>Use the dimensions to generate the dashboards and reports.</p></li></ul><p>Billy no longer needs separate pipelines, tables, and columns for multiple timezones. And the customer gets a consistent dashboard.</p><h2>Other Common Timezone Questions</h2><p>A few other commonly asked timezone-related questions may again cause confusion.</p><p><em>How should we show equipment from multiple timezones together on the same screen?</em></p><p>Viewing the detailed, raw equipment data is more common for monitoring purposes. In this case, show the timezone information along with the timestamp. Make it easy for users to know which timezone each piece of equipment is in.</p><p>Alternatively, some customers ask for conversion to a single timezone on the screen, either detected from the browser's local setting or controlled by users.</p><p>Since we store all timestamps in UTC and keep the original timezone information, it is relatively easy to implement either way.</p><p><em>How do we deal with vehicles driving through multiple timezones?</em></p><p>Vehicles should still have an assigned "business" timezone; all reporting should use that as the base. The fact that they drive through multiple timezones makes no difference. What matters is the elapsed time and the trajectory. 
Having a base timezone to "attach" the correctly calculated result is all we need.</p><h2>Conclusion</h2><p>Always question the requirements, and always make sure to talk to the right stakeholders.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> In the case above, the customer contact was from the IT department and was unfamiliar with the business requirements. Unthinkingly following the wrong requirements increases the system's complexity and overall cost. Worse yet, the system becomes untrustworthy to its actual users.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.bryantsai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Bryan Tsai! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><a href="https://www.reuters.com/article/us-venezuela-time/call-me-cuckoo-its-chavez-time-in-venezuela-idUSN1927682520070919/">https://www.reuters.com/article/us-venezuela-time/call-me-cuckoo-its-chavez-time-in-venezuela-idUSN1927682520070919/</a>. 
I honestly cannot imagine the kind of chaos such change caused.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><a href="https://www.inc.com/jeff-haden/elon-musks-algorithm-a-5-step-process-to-dramatically-improve-nearly-everything-is-both-simple-brilliant.html">https://www.inc.com/jeff-haden/elon-musks-algorithm-a-5-step-process-to-dramatically-improve-nearly-everything-is-both-simple-brilliant.html</a></p></div></div>]]></content:encoded></item><item><title><![CDATA[Dealing with Late-Arriving Data in Data Vault]]></title><description><![CDATA[There is always some late-arriving data. How do we deal with it in Data Vault?]]></description><link>https://www.bryantsai.com/p/dealing-with-late-arriving-data-in-data-vault</link><guid isPermaLink="false">https://www.bryantsai.com/p/dealing-with-late-arriving-data-in-data-vault</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Fri, 05 Jan 2024 11:32:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!K5ZN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K5ZN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!K5ZN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 424w, https://substackcdn.com/image/fetch/$s_!K5ZN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 848w, https://substackcdn.com/image/fetch/$s_!K5ZN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 1272w, https://substackcdn.com/image/fetch/$s_!K5ZN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K5ZN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png" width="1456" height="730" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:730,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2781929,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!K5ZN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 424w, https://substackcdn.com/image/fetch/$s_!K5ZN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 848w, https://substackcdn.com/image/fetch/$s_!K5ZN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 1272w, https://substackcdn.com/image/fetch/$s_!K5ZN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9f6745d-02d1-4545-93df-f2fb3e3bf6c7_2880x1444.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There is always some late-arriving data. A piece of equipment was disconnected for a few hours and then came back online. Someone forgot to enter a purchase order into the system and found out a month later. A maintenance work order was not marked closed for a few days. It's just a normal part of business.</p><p>Handling late-arriving data is messy because there's no bound on how late it can be. It could be a few seconds, a few minutes, a few hours, a few days, or even a few weeks. Because we don't know when it will come, or even whether there will be any late-arriving data at all, people tend to set an arbitrary limit, say 7 days, and periodically reprocess the last 7 days of data, just in case something arrived late. That sucks.</p>
<p>I talked about the <a href="https://www.bryantsai.com/p/maintaining-two-timelines-with-data-vault#%C2%A7late-arriving-data">late-arriving data problem</a> previously and how to deal with it by <a href="https://www.bryantsai.com/p/maintaining-two-timelines-with-data-vault#%C2%A7maintaining-two-timelines-with-data-vault">maintaining two timelines in Data Vault</a>. Now it's time to fill in the gap with some practical implementation details.</p><p>At a high level, there are two stages of processing required after receiving late-arriving data:</p><ul><li><p>Correctly saving the late-arriving data</p></li><li><p>Properly reprocessing its derived data</p></li></ul><p>Let's look at each stage in detail.</p><h2>Saving Late-Arriving Raw Data</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YFEn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YFEn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 424w, 
https://substackcdn.com/image/fetch/$s_!YFEn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 848w, https://substackcdn.com/image/fetch/$s_!YFEn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 1272w, https://substackcdn.com/image/fetch/$s_!YFEn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YFEn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png" width="1125" height="501" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:501,&quot;width&quot;:1125,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:112426,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YFEn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 424w, 
https://substackcdn.com/image/fetch/$s_!YFEn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 848w, https://substackcdn.com/image/fetch/$s_!YFEn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 1272w, https://substackcdn.com/image/fetch/$s_!YFEn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01e35403-6e35-41f3-a0a7-b95bed1ec008_1125x501.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Saving Late-Arriving Raw Data</figcaption></figure></div><p>With Data Vault, saving late-arriving "raw" data is no different from saving normally-arriving data. We maintain two timelines by also keeping the original business event time, so arriving late does not cause problems. Late-arriving data comes and sits right next to normally-arriving data nicely.</p><p>This is pretty straightforward: the normal ingestion pipeline takes care of it, and no special handling is required.</p><p>We've adopted an incremental loading pattern for Data Vault with our custom dbt materialization. This pattern requires the use of a monotonically increasing timestamp as the basis for incremental data processing. Each incremental batch first looks up the largest timestamp in the target table (essentially the point where the last run finished) and uses it as the lower bound to query the new data to be processed in the current batch. This way, no special bookkeeping is needed, which keeps it very simple.</p><p>To use it, simply set the materialization's config <code>timestamp_field</code>. By default, the system-maintained <code>load_datetime</code> is used. The materialization first queries <code>max(load_datetime)</code> from the target table and then uses it as the lower bound to query the source table. <code>load_datetime</code> is system-maintained and is guaranteed to always increase monotonically.</p><pre><code><code>{{ config(
  materialized = "rc_incremental",
  timestamp_field = "load_datetime"
) }}</code></code></pre><p>The materialization's config also allows for explicitly specifying the data range by giving the <code>start_timestamp</code> and/or <code>end_timestamp</code> (both optional). If <code>start_timestamp</code> is given, it is used directly instead of querying from the target table. The combination of these 3 configs is useful for one-off data-loading jobs like backloading or incremental initial bulk loading.</p><pre><code><code>{{ config(
  materialized = "rc_incremental",
  timestamp_field = "load_datetime",
  start_timestamp = "...",
  end_timestamp = "..."
) }}</code></code></pre><p>So how is it implemented? In the dbt model, add a replacement stub for the filtering conditions when querying the source table:</p><pre><code><code>{% if model.config.materialized == 'rc_incremental' %}
AND __TIMESTAMP_INCREMENTAL_FILTER__
{% endif %}</code></code></pre><p>The materialization macro replaces the stub with the filtering conditions queried from the target table (and/or <code>start_timestamp</code>/<code>end_timestamp</code>):</p><pre><code><code>{%- set timerange_filter -%}
  {{ dbtvault.build_timestamp_filter(target_relation, timestamp_field, target_timestamp_field) }}
{%- endset -%}

{%- set sql = sql | replace("__TIMESTAMP_INCREMENTAL_FILTER__", timerange_filter) -%}</code></code></pre><h2>Refreshing Derived Data</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Yyd5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yyd5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 424w, https://substackcdn.com/image/fetch/$s_!Yyd5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 848w, https://substackcdn.com/image/fetch/$s_!Yyd5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 1272w, https://substackcdn.com/image/fetch/$s_!Yyd5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Yyd5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png" width="1125" height="501" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:501,&quot;width&quot;:1125,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:111945,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Yyd5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 424w, https://substackcdn.com/image/fetch/$s_!Yyd5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 848w, https://substackcdn.com/image/fetch/$s_!Yyd5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 1272w, https://substackcdn.com/image/fetch/$s_!Yyd5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe4c3520-6852-403e-b45d-a7039efe2fd6_1125x501.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Refreshing Derived Data</figcaption></figure></div><p>Now that we have late-arriving "raw" data saved into the Data Vault, we still need to refresh any derived data based on it. Otherwise, the derived data would become inconsistent.</p><p>There are many kinds of derived data. There could be some business vaults generated from applying business rules (transformation) on the raw vault. There could also be some aggregation data generated from the raw vault for analytics or reporting purposes.</p><p>For transformed data, dbt's model dependency with the incremental strategy mentioned in the previous section can take care of it quite nicely. This is no different from handling the normally-arriving data. The same approach also works just fine for non-aggregation facts.</p><p>For entities (hub/satellite combos) and relations (link/satellite combos), we need to refresh their PIT tables to include the late-arrived data. 
To avoid regenerating complete PIT tables every time, the PIT table refreshing job identifies the affected data range. It first uses <code>load_datetime</code> to identify the new data for the current batch, and then derives from that new data the set of <code>effective_datetime</code> values (the business time) that need to be refreshed for the current batch. Our dimensions are dynamic views on top of PIT tables. As long as PIT tables are refreshed, dimensions automatically reflect the latest, correct data.</p><p>For aggregation facts, we use a different approach.</p><p>Not all aggregation methods can be updated incrementally the way sum or count can (a median, for example, has to be recomputed from the full set of rows). Also, deletions (markers) need to be handled properly. A more general drop-first-then-recreate approach is more suitable. When late-arriving data is received, any aggregation result it logically belongs to is purged first, and then the normal aggregation processing regenerates the result.</p><p>Our custom dbt materialization has another config for specifying the target table's column(s) as the replacement key(s). For example, a daily aggregation fact table contains a "date" column, which should be set as the replacement key. When late-arriving data for yesterday is received, the materialization drops yesterday's aggregation results first and then regenerates the aggregation with the latest data (which now includes the late-arriving data).</p><pre><code><code>{{ config(
  materialized = "rc_incremental",
  timestamp_field = "load_datetime",
  replace_by = "date"
) }}</code></code></pre><p>Usually, we use a hashed composite key like <code>hash([date, device_id])</code> for the replacement key. This way, we have finer control over the exact data set to replace instead of always having to reprocess everything.</p><p>Note that this approach relies on the system-maintained <code>load_datetime</code> to determine the data range to be processed for each batch. This timestamp needs to be centrally generated, either assigned by an orchestrator when it schedules jobs or generated automatically by the database server.</p><h2>Conclusion</h2><p>Traditionally, late-arriving data handling was messy. Mostly, people ended up using daily jobs that handled a rolling n-day window. This was not only slow but also expensive. In Data Vault, accepting and saving late-arriving "raw" data is no different from handling normally-arriving data. It is saved correctly alongside the normal, on-time data. With the same automated, repeatable, and modular pattern, refreshing derived data also becomes easy. No more daily jobs reprocessing the same data over and over again. The reprocessing is launched only when there's late-arriving data, and only for the relevant part. This is fast and cheap. This makes a solid foundation for <a href="https://www.bryantsai.com/p/generating-star-schemas-from-data-vault">generating star schemas from Data Vault</a>.</p>]]></content:encoded></item>
<item><title><![CDATA[Maintaining Two Timelines with Data Vault]]></title><description><![CDATA[In business architecture, there are at least two views: the real-world view and our system&#8217;s view.]]></description><link>https://www.bryantsai.com/p/maintaining-two-timelines-with-data-vault</link><guid isPermaLink="false">https://www.bryantsai.com/p/maintaining-two-timelines-with-data-vault</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Tue, 05 Dec 2023 13:49:23 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/fa5e7b6c-231f-4a17-a398-feae1f219179_800x400.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PFbQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PFbQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 424w, https://substackcdn.com/image/fetch/$s_!PFbQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 848w, 
https://substackcdn.com/image/fetch/$s_!PFbQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 1272w, https://substackcdn.com/image/fetch/$s_!PFbQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PFbQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e03224fb-5b02-4099-875c-e9b1091a91af_800x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PFbQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 424w, https://substackcdn.com/image/fetch/$s_!PFbQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 848w, 
https://substackcdn.com/image/fetch/$s_!PFbQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 1272w, https://substackcdn.com/image/fetch/$s_!PFbQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe03224fb-5b02-4099-875c-e9b1091a91af_800x400.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a><figcaption class="image-caption">Multiverse Branching Timelines</figcaption></figure></div><h3>The Two Timelines</h3><p>The first principle of business architecture is never to discard information. Save all business information forever and keep track of all changes. This decouples data processing from data ingestion, so we are free to start a new analytical processing job, or rerun an existing one, at any time with any slice of history.</p><p>In business architecture, there are at least two views: the real-world view and our system&#8217;s view. For example, a business contract is signed in the real world, and at a later time, the record of this contract enters our system. There is always some delay between the time the actual event occurs and the time our system records it.</p><pre><code>| System Load Date | Contract Num. | Business Date | Amount |
| ---------------- | ------------- | ------------- | ------ |
| 2023-11-02       | 101           | 2023-11-01    | 10,000 |
| 2023-11-16       | 101           | 2023-11-15    | 20,000 |</code></pre><p>In the example above, contract 101 became effective in the real world on 2023-11-01 for an amount of 10K. Later, its amount was changed to 20K, effective from 2023-11-15. Since contract data is loaded into the system daily, the system load date is typically the next day.</p><p>The two views each have their own timeline. Do we need to care about both timelines? Well, that depends on your requirements. Generally, there are two categories of questions we have to answer:</p><ul><li><p>What is contract 101&#8217;s amount, right now? (20K)</p></li><li><p>What was contract 101&#8217;s amount in the system on 2023-11-10? What was it on 2023-11-20? (10K and 20K, respectively)</p></li></ul><p>The first question certainly needs to be answered, as it pertains to actual business. The second is typically for auditing purposes. If you do need to answer the second kind of question, then you need to maintain the second timeline for the system&#8217;s view.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!97C2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!97C2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 424w, https://substackcdn.com/image/fetch/$s_!97C2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 848w, 
https://substackcdn.com/image/fetch/$s_!97C2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 1272w, https://substackcdn.com/image/fetch/$s_!97C2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!97C2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!97C2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 424w, https://substackcdn.com/image/fetch/$s_!97C2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 848w, 
https://substackcdn.com/image/fetch/$s_!97C2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 1272w, https://substackcdn.com/image/fetch/$s_!97C2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82de0343-abcd-48fd-9a73-7d7382d56496_625x432.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Different Views/Timelines</figcaption></figure></div><p>However, even if you don&#8217;t have requirements for the second kind of question right now, how can you be sure your business users won&#8217;t change their minds later? Since the system timeline cannot be recovered once the initial ingestion is done, we&#8217;d better keep it from the very beginning anyway.</p><p>So, how do we maintain two timelines?</p><p>For each event entering our system, we need to record both:</p><ul><li><p>When the event happened in the real world.</p></li><li><p>When the event was recorded by our system.</p></li></ul><p>The goal is to be able to answer both categories of questions above. That&#8217;s what the architecture needs to enable.</p><h3>Late-Arriving Data</h3><p>At this point, you might wonder whether we can just use the system&#8217;s timeline to represent the real-world timeline. The answer comes down to a problem caused by late-arriving data that makes retroactive updates.</p><pre><code>| System Load Date | Contract Num. | Business Date | Amount |
| ---------------- | ------------- | ------------- | ------ |
| 2023-11-02       | 101           | 2023-11-01    | 10,000 |
| 2023-11-16       | 101           | 2023-11-15    | 20,000 |
| 2023-11-22       | 101           | 2023-11-01    | 15,000 |</code></pre><p>Let&#8217;s expand the same contract example a bit. On 2023-11-22, a correction was made to change contract 101&#8217;s amount to 15K, effective from 2023-11-01. This is what we call late-arriving data making a retroactive update. The cause might be a human error or a business decision; it does not matter. From the real-world view, once the correction is loaded into the system, contract 101&#8217;s amount on Nov 1 should be 15K, no longer 10K.</p><p>You can think of the real-world view as the system&#8217;s &#8220;as-of-now&#8221; view. You&#8217;ll also notice below that the system view as of 2023-11-30 changes because the correction was applied on 2023-11-22.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hGPG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hGPG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 424w, https://substackcdn.com/image/fetch/$s_!hGPG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 848w, https://substackcdn.com/image/fetch/$s_!hGPG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 1272w, 
https://substackcdn.com/image/fetch/$s_!hGPG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hGPG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hGPG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 424w, https://substackcdn.com/image/fetch/$s_!hGPG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 848w, https://substackcdn.com/image/fetch/$s_!hGPG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 1272w, 
https://substackcdn.com/image/fetch/$s_!hGPG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a2c92-0cfd-436e-bcf2-b6c675282b12_626x433.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Different Views/Timelines after a Retroactive Update</figcaption></figure></div><p>If we only keep system load dates and treat them as real-world business dates, we will mistakenly take contract 101&#8217;s amount to be 10K from 11-02 to 11-16, 20K from 11-16 to 11-22, and 15K from 11-22 onward. That is not what the correction intended, which was for the amount to be 15K from 11-02 to 11-16.</p><p>Note that many source business systems do not provide real-world business dates. In those cases, the best we can do is approximate the business dates with our system&#8217;s load dates. When business dates are available from the source systems, keep and use them.</p><p>One more thing about late-arriving data. Downstream data processing may be affected by it, directly or indirectly. All the affected time ranges, as well as all existing results computed from data in those time ranges, need to be identified and reprocessed automatically.</p><h3>Maintaining Two Timelines with Data&nbsp;Vault</h3><p>Below are the standard table schemas for all our data vault constructs. 
We standardize not only the schema but also all the table and column names.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cv6V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cv6V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 424w, https://substackcdn.com/image/fetch/$s_!Cv6V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 848w, https://substackcdn.com/image/fetch/$s_!Cv6V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 1272w, https://substackcdn.com/image/fetch/$s_!Cv6V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cv6V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png" width="818" height="207" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:207,&quot;width&quot;:818,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Cv6V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 424w, https://substackcdn.com/image/fetch/$s_!Cv6V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 848w, https://substackcdn.com/image/fetch/$s_!Cv6V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 1272w, https://substackcdn.com/image/fetch/$s_!Cv6V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4a6b7a3-ae9c-4d3a-b8cd-871bd524c38e_818x207.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Data Vault Construct Standardized Schemas</figcaption></figure></div><p>The system timeline uses the data vault&#8217;s standard <code>load_datetime</code> column. 
For the real-world business timeline, the column <code>effective_datetime</code> is added to the satellite (transactional links have no associated satellites, so there the column is added directly to the link itself).</p><p>As mentioned in the previous section, whenever the source systems provide a business timeline, we recommend using it. Otherwise, the best option is to approximate it with the system timeline. If users do not write a value to <code>effective_datetime</code>, our system automatically fills it with the value of <code>load_datetime</code>.</p><p>That finishes the normal ingestion flow. What about the late-arriving data mentioned above?</p><p>Remember that in a data vault, satellite data is compressed in an SCD2 structure: a new row is saved only when the value changes. This makes things a little tricky, because we need to consider the chronologically immediate neighbors, both the previous and the next, to decide what to insert into the satellite and how.</p><p>Let&#8217;s use a simple example. Originally, both T1:(1) and T3:(3) were already processed and entered into the data vault. Since satellites are in SCD2 structure, if T1:(1) and T3:(3) have the same value, only T1:(1) is saved in the satellite. On the other hand, if they have different values, both T1:(1) and T3:(3) are saved in the satellite. 
T2:(2) is the late-arriving data, which falls chronologically between T1 and T3.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!laHR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!laHR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 424w, https://substackcdn.com/image/fetch/$s_!laHR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 848w, https://substackcdn.com/image/fetch/$s_!laHR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 1272w, https://substackcdn.com/image/fetch/$s_!laHR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!laHR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!laHR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 424w, https://substackcdn.com/image/fetch/$s_!laHR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 848w, https://substackcdn.com/image/fetch/$s_!laHR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 1272w, https://substackcdn.com/image/fetch/$s_!laHR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e5d2189-a021-4383-bad4-6a6d2d4c0099_239x163.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>There are 5 possible scenarios to be considered:</p><p>1. Originally both T1:(1) and T3:(3) are A, hence only one row in the satellite. 
The late-arriving T2:(2)=A is the same as both, hence only one row in the satellite at the end.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hqjg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hqjg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!hqjg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!hqjg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!hqjg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hqjg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hqjg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!hqjg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!hqjg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!hqjg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18f36a6b-8036-438a-ac96-80920bc1b54c_300x120.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>2. Originally both T1:(1) and T3:(3) are A, hence only one row in the satellite. The late-arriving T2:(2)=B is different from both T1 and T3, hence we have to insert two rows, one for T2:(2)=B and one for T3:(3)=A to correctly save the history in the satellite. 
There are 3 rows at the end.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g3_f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g3_f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!g3_f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!g3_f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!g3_f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g3_f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g3_f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!g3_f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!g3_f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!g3_f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30fd1faf-7458-4c1a-9d54-6fa78397c3a9_300x120.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>3. Originally T1:(1) is A and T3:(3) is C, hence two rows are saved in the satellite. The late-arriving T2:(2)=A is the same as T1, hence there&#8217;s no need to insert a new row for T2. 
There are still two rows at the end.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oCYT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oCYT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!oCYT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!oCYT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!oCYT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oCYT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oCYT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!oCYT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!oCYT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!oCYT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55dd45a1-02e6-4bf5-a15d-aee3e08a6ab9_300x120.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>4. Originally T1:(1) is A and T3:(3) is C, hence two rows are saved in the satellite. The late-arriving T2:(2)=B is different from T1:(1)=A and also different from T3:(3)=C, hence we need to insert a row for T2. 
There are three rows at the end.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oQpm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oQpm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!oQpm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!oQpm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!oQpm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oQpm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oQpm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!oQpm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!oQpm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!oQpm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0160dd15-8fb6-4ac8-ac64-2bddf4f8d5f8_300x120.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>5. Originally T1:(1) is A and T3:(3) is C, hence two rows are saved in the satellite. The late-arriving T2:(2)=C is different from T1:(1)=A but the same as T3:(3)=C. We have to insert a row for T2. There are three rows at the end. Note that T2:(2)=C and T3:(3)=C have the same value and normally should be compressed to leave only one row for T2. 
However, to &#8220;never discard information&#8221;, we choose to keep them both.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XdjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XdjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!XdjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!XdjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!XdjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XdjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XdjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 424w, https://substackcdn.com/image/fetch/$s_!XdjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 848w, https://substackcdn.com/image/fetch/$s_!XdjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 1272w, https://substackcdn.com/image/fetch/$s_!XdjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff89312a7-1555-4b02-8b09-67c2c2a3bc60_300x120.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Patrick Cuba has an awesome <a href="https://patrickcuba.medium.com/data-vault-2-0-has-a-new-hero-a8335d961210">article</a> on how to use an extended record-tracking satellite (XTS) to support this processing.</p><p>As mentioned in the previous section, after T2:(2) is processed and entered into the satellite, any downstream processing affected by this change needs to be 
identified and reprocessed.</p><h3>Querying the Two Timelines</h3><p>With the two timelines maintained in our data vault, how do we query them?</p><p>Remember the two categories of questions we need to answer:</p><ul><li><p>What is contract 101&#8217;s amount, right now?</p></li><li><p>What was contract 101&#8217;s amount in the system on 2023-11-10? What was it on 2023-11-20?</p></li></ul><p>We let users choose which timeline to use when generating dimensions. Users can generate different versions, typically one for the business timeline view and another for the system timeline view.</p><p>For business-view dimensions, our system first generates the underlying business-view PIT tables. We use snapshot PIT tables, so we first group by the column <code>effective_datetime</code>, and within each group, we select the one row with the latest <code>load_datetime</code> value. As a result, for each snapshot date, the PIT table rows all point to the correct satellite row based on the business timeline. Finally, the dimension is created by joining the PIT tables to the base hub and satellite tables, with the column <code>effective_datetime</code> included as the time base.</p><p>For system-view dimensions&#8217; PIT tables, the only difference is that we pre-filter the data with snapshot dates on the column <code>load_datetime</code>. This ensures that for each snapshot date, only the data that was visible in the system at that time is present. 
Then we follow the same process of generating business-view dimensions.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6z0n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6z0n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 424w, https://substackcdn.com/image/fetch/$s_!6z0n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 848w, https://substackcdn.com/image/fetch/$s_!6z0n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 1272w, https://substackcdn.com/image/fetch/$s_!6z0n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6z0n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6z0n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 424w, https://substackcdn.com/image/fetch/$s_!6z0n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 848w, https://substackcdn.com/image/fetch/$s_!6z0n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 1272w, https://substackcdn.com/image/fetch/$s_!6z0n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc68fd8d-3677-4002-9da5-fc6bafcba1b6_800x538.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Separate Dimensions for Different Views</figcaption></figure></div><p>For facts (transactional links), the process is the same. 
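</p><p>As a rough illustration of the row selection described above, the following SQL sketches how a snapshot PIT could pick one satellite row per snapshot date. The names <code>as_of_dates</code>, <code>sat_contract</code>, and <code>hk_contract</code> are hypothetical, and this is a simplified sketch under those assumptions rather than the SQL our system actually generates:</p><pre><code>-- Hypothetical inputs (assumed for illustration only):
--   as_of_dates(as_of_date)                        one row per snapshot date
--   sat_contract(hk_contract, effective_datetime,
--                load_datetime, amount)            satellite with both timelines
WITH ranked AS (
    SELECT
        d.as_of_date,
        s.hk_contract,
        s.effective_datetime,
        s.load_datetime,
        -- Latest business time first; ties on effective_datetime are
        -- resolved by the most recently loaded row.
        ROW_NUMBER() OVER (
            PARTITION BY d.as_of_date, s.hk_contract
            ORDER BY s.effective_datetime DESC, s.load_datetime DESC
        ) AS rn
    FROM as_of_dates AS d
    JOIN sat_contract AS s
        ON s.effective_datetime &lt;= d.as_of_date
        -- System view only: also hide rows not yet loaded at the snapshot date.
        -- AND s.load_datetime &lt;= d.as_of_date
)
SELECT as_of_date, hk_contract, effective_datetime, load_datetime
FROM ranked
WHERE rn = 1;
</code></pre><p>Uncommenting the extra <code>load_datetime</code> predicate turns the same query into the system-view variant, which is why the two views can share one generation pattern.</p><p>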
Therefore, users can choose to use the business-view version to answer the first question and to use the system-view version to answer the second question.</p><p>Note that the business-view PIT tables do need to be regenerated for the affected parts, which can be done incrementally whenever there is late-arriving data. However, the system-view PIT tables do not need to be regenerated because <code>load_datetime</code> is always monotonically increasing.</p><h3>Conclusion</h3><p>Unlike the <a href="https://en.wikipedia.org/wiki/Time_Variance_Authority">TVA</a>, we don&#8217;t have ever-increasing multiverse timelines to manage or prune. However, we do have two &#8220;sacred&#8221; timelines to maintain. The &#8220;real-world&#8221; timeline serves the business requirements, and keeping the full history of this timeline allows for the retrieval of historical values. The &#8220;system&#8221; timeline serves the auditing requirements to ensure that retroactive updates do not go rogue. Maintaining the two timelines in a data vault can be easily done with an automated, repeatable, and modular pattern applied. Together with <a href="https://www.bryantsai.com/p/generating-star-schemas-from-data-vault">systematic star schema generation from the data vault</a>, it serves as a solid foundation for building analytical solutions. Let the Time-Keepers protect your timeline for you.</p>]]></content:encoded></item><item><title><![CDATA[Generating Star Schemas from Data Vault]]></title><description><![CDATA[Our RootCloud Platform provides an easy way for users to construct their enterprise data vault to generate dimensional star schemas&#8230;]]></description><link>https://www.bryantsai.com/p/generating-star-schemas-from-data-vault</link><guid isPermaLink="false">https://www.bryantsai.com/p/generating-star-schemas-from-data-vault</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Fri, 24 Nov 2023 09:51:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hK0g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hK0g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 424w, 
https://substackcdn.com/image/fetch/$s_!hK0g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 848w, https://substackcdn.com/image/fetch/$s_!hK0g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 1272w, https://substackcdn.com/image/fetch/$s_!hK0g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hK0g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png" width="1456" height="651" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:651,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5733963,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hK0g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 424w, 
https://substackcdn.com/image/fetch/$s_!hK0g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 848w, https://substackcdn.com/image/fetch/$s_!hK0g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 1272w, https://substackcdn.com/image/fetch/$s_!hK0g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1686e1d2-a643-490d-b8f5-31c3873527d9_2880x1288.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Our <a href="https://en.rootcloud.com/rootcloud-platform">RootCloud Platform</a> provides an easy way for users to construct their enterprise data vault. With it, our users can manage the data vault&#8217;s table schema, ingest IoT data, incrementally import and transform IT data from external systems, connect BI tools to it, and build applications directly with its APIs. One special sauce we have is the direct integration of OT and IT data. As long as the &#8220;link&#8221; between OT and IT is available in the vault, our platform can automatically integrate them to serve business queries.</p><p>There&#8217;s one caveat: a data vault is not intended to be queried directly. Data vault brings many benefits to the table, but easy and fast querying is not one of them. Its creator Dan Linstedt recommends adding <a href="https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/">Kimball dimensional modeling</a> data marts as outputs specifically to serve business queries. This approach brings the best of the two modeling techniques together. 
That&#8217;s what our system provides.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vUAE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vUAE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 424w, https://substackcdn.com/image/fetch/$s_!vUAE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 848w, https://substackcdn.com/image/fetch/$s_!vUAE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 1272w, https://substackcdn.com/image/fetch/$s_!vUAE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vUAE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vUAE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 424w, https://substackcdn.com/image/fetch/$s_!vUAE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 848w, https://substackcdn.com/image/fetch/$s_!vUAE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 1272w, https://substackcdn.com/image/fetch/$s_!vUAE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F987ce6ec-e17c-4d52-88bb-967c386a66b2_734x435.png 1456w" sizes="100vw"></picture><div></div></div></a><figcaption class="image-caption">Enterprise data vault paired with star schema data&nbsp;marts.</figcaption></figure></div><p>The data vault serves as the single source of truth within the enterprise. Users can freely pick the required entities, relationships, and transaction/event tables to create a star schema easily. 
The process is also super lightweight (a star schema is mostly just a set of views on top of data vault tables), so users can create as many star schemas as desired. That enables the agility necessary to serve ever-changing business requirements.</p><p>This architecture has many advantages:</p><ul><li><p>New data sources can easily be added and ingested into the data vault without impacting the data marts.</p></li><li><p>Since all the data is in the data vault, dropping or recreating data marts is super easy.</p></li><li><p>Different teams can work together, in parallel, on different star schemas, based on the same data vault. No team is going to be disrupted by other teams.</p></li><li><p>High agility is achieved. A PoC can be done easily by simply creating a new star schema and populating it with data from the data vault. The result is either checked and published or rejected and scrapped.</p></li><li><p>Business rules are implemented and governed in the business vault and hence can be shared by all teams.</p></li></ul><h3>From Data Vault to Star&nbsp;Schema</h3><p>So how did we achieve it?</p><p>As we use dbt internally for populating the data vault, naturally we also want to use dbt to create and populate data marts. We have created a few dbt macros for generating star schema dimension and fact tables from data vault constructs. This conforms to our automated, repeatable, and modular patterns.</p><p>In our data vault, we model business entities as hub-and-satellite(s) combos, and relationships as link-and-satellite(s) combos. 
For transactional events or IoT measurement data, we use transactional links.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_Ihh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_Ihh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 424w, https://substackcdn.com/image/fetch/$s_!_Ihh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 848w, https://substackcdn.com/image/fetch/$s_!_Ihh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 1272w, https://substackcdn.com/image/fetch/$s_!_Ihh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_Ihh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a43d77a-1730-4025-9394-fd85a785805a_758x666.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_Ihh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 424w, https://substackcdn.com/image/fetch/$s_!_Ihh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 848w, https://substackcdn.com/image/fetch/$s_!_Ihh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 1272w, https://substackcdn.com/image/fetch/$s_!_Ihh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a43d77a-1730-4025-9394-fd85a785805a_758x666.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Mapping from data vault to generated star&nbsp;schema.</figcaption></figure></div><p>In a generated star schema, dimensions are based on entities, and facts are based on transactional links. 
Relationships are not mapped directly but are used to connect entities for snowflake-style usage or for denormalizing into flattened facts.</p><p>For dimension tables, we provide two flavors: with and without history tracking. You get to choose between SCD1 and SCD2 for individual dimensions (of course, our query API handles that for users automatically). To generate dimension tables efficiently, we take advantage of point-in-time (PIT) tables. Because the PIT tables are generated periodically and incrementally, our star schema dimension tables can simply be views defined on the PIT tables together with their base hubs and satellites. This is key to our users as they get SCD2 for free, if and when needed.</p><p>Based on the awesome <a href="https://github.com/Datavault-UK/automate-dv">AutomateDV</a>, below is the dbt macro for generating a dimension table using PIT:</p><pre><code>{%- macro dimension(src_pk, src_nk, src_extra_columns, satellites, exclude_columns, src_ldts, source_model, pit) -%}

    {{- dbtvault.check_required_parameters(src_pk=src_pk, src_nk=src_nk,
                                           src_ldts=src_ldts,
                                           source_model=source_model) -}}

    {{- dbtvault.prepend_generated_by() }}

    {{ adapter.dispatch('dimension', 'dbtvault')(src_pk=src_pk, src_nk=src_nk,
                                                 src_extra_columns=src_extra_columns,
                                                 satellites=satellites,
                                                 exclude_columns=exclude_columns,
                                                 src_ldts=src_ldts,
                                                 source_model=source_model,
                                                 pit=pit) -}}
{%- endmacro -%}

{%- macro default__dimension(src_pk, src_nk, src_extra_columns, satellites, exclude_columns, src_ldts, source_model, pit) -%}

{%- set source_cols = dbtvault.expand_column_list(columns=[src_pk, src_nk, src_extra_columns]) %}
{%- set include_columns = [] %}
{%- set exclude_columns = exclude_columns | map("lower") | list %}
{%- for col in source_cols -%}
    {%- if col | lower not in exclude_columns -%}
        {% do include_columns.append(col) %}
    {%- endif %}
{%- endfor %}

{%- set pit_name = pit['pit_name'] -%}
{%- set pit_pk = pit['pk']['pk'] -%}
{%- set pit_ldts_name = pit['ldts'] %}
{%- set pit_as_of_date_name = pit['as_of_date'] %}

SELECT
    {{ dbtvault.alias_all(include_columns, pit_name | lower ~ '_src') }}
    {%- if pit_as_of_date_name %}
    ,{{ pit_name | lower ~ '_src' }}.as_of_date as {{ pit_as_of_date_name }}
    {%- endif %}
{%- for sat_name in satellites -%}
    {%- if 'extra_columns' in satellites[sat_name] and satellites[sat_name]['extra_columns'] -%},
    {%- set sat_extra_columns = satellites[sat_name]['extra_columns'] %}
    {{ dbtvault.alias_all(sat_extra_columns, sat_name | lower ~ '_src') }}
    {%- endif -%}
{%- endfor %}

FROM {{ ref(pit_name) }} AS {{ pit_name | lower ~ '_src' }}

{%- for sat_name in satellites -%}

{%- set sat_pk_name = (satellites[sat_name]['pk'].keys() | list )[0] -%}
{%- set sat_ldts_name = (satellites[sat_name]['ldts'].keys() | list )[0] -%}
{%- set sat_deleted_name = (satellites[sat_name]['deleted'].keys() | list )[0] -%}
{%- set sat_pk = satellites[sat_name]['pk'][sat_pk_name] -%}
{%- set sat_deleted = satellites[sat_name]['deleted'][sat_deleted_name] %}

LEFT JOIN {{ ref(sat_name) }} AS {{ sat_name | lower ~ '_src' }}
    ON {{ sat_name | lower ~ '_src' }}.{{ src_pk }} = {{ pit_name | lower ~ '_src' }}.{{ pit_pk }}
    AND {{ sat_name | lower ~ '_src' }}.{{ sat_ldts_name }} = {{ pit_name | lower ~ '_src' }}.{{ sat_name ~ '_' ~ pit_ldts_name }}
{% endfor %}

{%- endmacro -%}</code></pre><p>and the usage of this macro:</p><pre><code>{{ config(
  schema = "mart",
  materialized = "view",
) }}

{%- set yaml_metadata -%}
source_model: hub_ent_equip
src_pk: hk_ent_equip
src_nk:
  - source_column: equip_id
    alias: equip_id
src_extra_columns:
  - source_column: id_pit_ent_equip
    alias: id_ent_equip
pit:
  pit_name: pit_ent_equip
  pk:
    pk: hk_ent_equip
  ldts: load_datetime
  as_of_date: effective_datetime
exclude_columns:
  - hk_ent_equip
src_ldts: load_datetime
{%- endset -%}

{% set metadata_dict = fromyaml(yaml_metadata) %}

{{ dbtvault.dimension(source_model=metadata_dict['source_model'],
                      src_pk=metadata_dict['src_pk'],
                      src_nk=metadata_dict['src_nk'],
                      src_extra_columns=metadata_dict['src_extra_columns'],
                      satellites=metadata_dict.get('satellites', {}),
                      exclude_columns=metadata_dict['exclude_columns'],
                      src_ldts=metadata_dict['src_ldts'],
                      pit=metadata_dict['pit']) }}</code></pre><p>For fact tables, two options are available. For simple cases, say transaction line-item data persisted in a transactional link, the corresponding fact table is simply synchronized incrementally from the link table. The line items, products, and users referenced in the data all become the fact table&#8217;s foreign-key references to the corresponding dimension tables. This is a simple 1-to-1 mapping from the data vault to the generated star schema.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sntF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sntF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 424w, https://substackcdn.com/image/fetch/$s_!sntF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 848w, https://substackcdn.com/image/fetch/$s_!sntF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 1272w, https://substackcdn.com/image/fetch/$s_!sntF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!sntF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sntF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 424w, https://substackcdn.com/image/fetch/$s_!sntF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 848w, https://substackcdn.com/image/fetch/$s_!sntF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 1272w, https://substackcdn.com/image/fetch/$s_!sntF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40581935-fe8d-4bed-b6bc-d66610bf8dac_525x287.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Star schema dimension and fact&nbsp;tables.</figcaption></figure></div><p>For more complicated cases, usually, a chain of links and hubs 
needs to be joined first to enrich the transactional links with the required dimensions. In this case, we allow users to specify the &#8220;join chain&#8221; and the intended dimensions. Essentially, this creates a &#8220;denormalized&#8221; fact table. An example would be IoT data, which usually comes with just the sensor ID. Of course, we could go with a snowflake schema, but we are really more star guys. By specifying the &#8220;join chain&#8221; from the sensor ID to the equipment it belongs to, and perhaps further from the equipment to its organization, location, and work order, it is very easy to populate the fact table required to serve different business needs.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tBc9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tBc9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 424w, https://substackcdn.com/image/fetch/$s_!tBc9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 848w, https://substackcdn.com/image/fetch/$s_!tBc9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 1272w, https://substackcdn.com/image/fetch/$s_!tBc9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tBc9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tBc9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 424w, https://substackcdn.com/image/fetch/$s_!tBc9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 848w, https://substackcdn.com/image/fetch/$s_!tBc9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 1272w, https://substackcdn.com/image/fetch/$s_!tBc9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1877cb85-ca0f-4061-9508-e3a2ba94efff_632x397.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Denormalize snowflake schema to star&nbsp;schema.</figcaption></figure></div><p>Remember since the data vault is the source of truth, any 
time a new business requirement comes in, we can update an existing star schema that suits the purpose, drop and recreate one, or simply create a new star schema. Either way, the other star schemas are not affected.</p><h3>Unified View of&nbsp;Data</h3><p>One of the most important things about a data warehouse is a shared understanding of the business entities and the business processes. This is largely served by the data vault. But what about the dimensional data marts generated from it? Wouldn&#8217;t a separate set of data marts conflict with that?</p><p>The dimension tables we generate are simply views on top of business entities (hubs and their satellites). Since entities are shared and are the single source of truth, all the dimension tables are guaranteed to be consistent even though different users have their own set of dimension views. Essentially, all users are using the same shared dimension data.</p><p>As for fact tables, allowing different users to share physical fact tables has many downsides. Any change to a shared fact table would affect someone else, and that&#8217;s disruption we don&#8217;t want. To enable full agility, different users must have their own, isolated fact tables. Does that conflict with the goal of having a standard understanding of the business? I believe not. As long as the underlying source is the shared data vault, isolated fact tables simply provide the view of the data that each presentation or visualization requirement needs. And that&#8217;s the main point of having multiple data marts.</p><h3>Star Schema Query Performance</h3><p>There&#8217;s one thing worth mentioning. Originally, we used the data vault&#8217;s hashkey directly in the generated star schemas. It worked fine, but performance suffered as the table size increased. This is mostly due to the hashkey&#8217;s length (we use SHA256, which ends up 128 bytes long). 
For non-trivial fact tables, this became an issue for both table size and query speed.</p><p>We chose to generate a sequential integer ID for data vault constructs, which is used only in the generated data marts (the data vault itself always uses the hashkey). The sequence ID is stored together with the hashkey in hubs. In the generated star schema dimension and fact tables, the hashkey is removed and only the sequence ID is included. This improved query performance by at least an order of magnitude.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gRKG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gRKG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 424w, https://substackcdn.com/image/fetch/$s_!gRKG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 848w, https://substackcdn.com/image/fetch/$s_!gRKG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 1272w, https://substackcdn.com/image/fetch/$s_!gRKG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gRKG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gRKG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 424w, https://substackcdn.com/image/fetch/$s_!gRKG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 848w, https://substackcdn.com/image/fetch/$s_!gRKG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 1272w, https://substackcdn.com/image/fetch/$s_!gRKG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3ce1127-f37e-4f00-9713-6d758d698bef_598x415.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Star schema uses sequence ID instead of&nbsp;hashkey.</figcaption></figure></div><p>The creation of the sequence ID column can be 
standardized in the data vault construct dbt SQL file, like the following:</p><pre><code>{% set sequence_id = 'id_ent_equip' %}

{{ config(
  schema = "er",
  materialized = "rc_incremental",
  upsert_by = "hk_ent_equip",
  timestamp_field = "load_datetime",
  ignored_dest_columns = sequence_id
) }}

{% call set_post_sql_header(config) %}
  alter table {{ this }} add column if not exists {{ sequence_id }} integer generated always as identity;
  create unique index if not exists {{ sequence_id }}_{{ this.table }}_true_integer on {{ this }} ({{ sequence_id }});
{%- endcall %}

{%- set yaml_metadata -%}
source_model: stg_ent_equip
src_pk: hk_ent_equip
src_nk:
  - equip_id
src_ldts: load_datetime
src_source: record_source
{%- endset -%}

{% set metadata_dict = fromyaml(yaml_metadata) %}

{{ dbtvault.hub(src_pk=metadata_dict["src_pk"],
                src_nk=metadata_dict["src_nk"],
                src_ldts=metadata_dict["src_ldts"],
                src_source=metadata_dict["src_source"],
                source_model=metadata_dict["source_model"]) }}</code></pre><h3>Conclusion</h3><p>With an automated, repeatable, and modular pattern, we made data vault&#8217;s integration with Kimball dimensional data marts easy. This separation of concerns serves the business pretty well, as multiple teams get to work together, in parallel, on the same data vault. The evolution of the data vault typically does not affect the data marts (that&#8217;s what data vault is designed for), and the business reporting and analytics teams can iterate quickly and independently. And that is really the goal of using data vault.</p>]]></content:encoded></item><item><title><![CDATA[Serverless Feed Aggregator]]></title><description><![CDATA[With Serverless Framework, OpenWhisk, and Cloudant]]></description><link>https://www.bryantsai.com/p/serverless-feed-aggregator</link><guid isPermaLink="false">https://www.bryantsai.com/p/serverless-feed-aggregator</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Thu, 25 May 2017 03:08:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0d34f875-853e-41e6-81e0-d07de2546452_1200x801.jpeg" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<h4>With <a href="https://serverless.com">Serverless Framework</a>, <a href="https://console.ng.bluemix.net/openwhisk/">OpenWhisk</a>, and&nbsp;<a href="https://cloudant.com">Cloudant</a></h4><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rY3y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rY3y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rY3y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rY3y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rY3y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rY3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rY3y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rY3y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rY3y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rY3y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ebe48c2-38db-458e-9384-d2de555cb767_1200x801.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a><figcaption class="image-caption">Photo by&nbsp;<a href="https://pixabay.com/en/users/bodsa-1622683/">Bodsa</a></figcaption></figure></div><p><a href="http://andrewchen.co/the-death-of-rss-in-a-single-graph/">RSS is long dead</a>. Yes, there is still <a href="https://feedly.com">Feedly</a>, but I found myself going there less and less often. 
One reason is that the interesting content usually ended up being shared or recommended to me on Facebook, Twitter, Medium, or <a href="https://getpocket.com">Pocket</a> anyway. When I do go to Feedly (I have more than a hundred feeds there), I mostly just mark everything as read&#8230; Not a good way to kill time.</p><p>On the other hand, for a small percentage of less popular content (but must-read for me)[1], I do use <a href="https://ifttt.com/feed">IFTTT RSS Feed</a> to automatically save new content to my Pocket. It works perfectly for me, except&nbsp;&#8230; I have to create one applet per feed and I have to manage them manually. I could use a feed aggregator (there are many options available), but most of them do not allow updating the mixed feed once it is created.</p><p>Well, that is a good opportunity (excuse?) for me to play with something new, more specifically:</p><ol><li><p>Event-based architecture</p></li><li><p>Serverless Framework with OpenWhisk</p></li></ol><h4>Event-based architecture</h4><p>It is data-centric, with various events emitted to trigger different actions. 
In my case, events can be data changes or time-based schedules.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QE-Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QE-Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 424w, https://substackcdn.com/image/fetch/$s_!QE-Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 848w, https://substackcdn.com/image/fetch/$s_!QE-Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 1272w, https://substackcdn.com/image/fetch/$s_!QE-Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QE-Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QE-Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 424w, https://substackcdn.com/image/fetch/$s_!QE-Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 848w, https://substackcdn.com/image/fetch/$s_!QE-Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 1272w, https://substackcdn.com/image/fetch/$s_!QE-Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6510d32d-fdb9-4312-8100-81ea26324abc_800x301.png 1456w" sizes="100vw"></picture><div></div></div></a><figcaption class="image-caption">My Feed Aggregator Architecture</figcaption></figure></div><ol><li><p>The first area is the feeds database, which simply stores the URLs of the feeds I&#8217;m interested in. Simple wrapper actions are created for adding feeds to and deleting feeds from this database.</p></li><li><p>The second area is feed crawling. 
The feed aggregator needs to constantly check each feed to see if any new items have been published. A <code>cron</code> action is triggered every hour and iterates through all the registered feeds. It invokes the <code>crawl</code> action to find new feed items and write them to the items database.</p></li><li><p>The third area is the notification of new feed items. This is triggered by the OpenWhisk Cloudant feed whenever a new item is written to the items database. I&#8217;m taking advantage of IFTTT to trigger the saving to my Pocket.</p></li></ol><p>Event-based flow makes actions feel more like &#8220;glue&#8221;, which is a good thing. Most actions are small and mostly isolated from the others. Each focuses on one and only one &#8220;thing&#8221;. The <code>put</code> and <code>delete</code> actions only deal with saving/deleting feed records. The <code>cron</code> action only fetches the list of feed URLs from the database and passes them to the <code>crawl</code> action. The <code>crawl</code> action only fetches feeds, parses their items, and writes them to the database. The <code>ifttt</code> action is triggered whenever a new item is written to the items database and only notifies IFTTT with the received feed item URL.</p><p>This makes extending the system easy and safe. I later added a web action to return an RSS feed of the latest 20 items. A simple Cloudant query index was created to return the 20 items, and the web action simply composes an RSS XML document from them. Nothing already there was touched.</p><p>This is pretty trivial in functionality, but it does capture the spirit of a serverless architecture.</p><p>A few hacks related to Cloudant make the action code simpler (and more efficient). One is that I use the feed and feed item URLs as <code>_id</code> (the Cloudant document&#8217;s unique identifier). After a feed is parsed, all the feed items are simply written to the items database with a single bulk API call. 
I didn&#8217;t bother to check each feed item to see if it already existed (that would require 2 API calls per feed item). Since the feed item URL is used as <code>_id</code>, Cloudant simply skips the items that already exist. After all, I only ever want one notification for any feed item. BTW, the notification from the OpenWhisk Cloudant feed only contains <code>_id</code>. Because that <code>_id</code> is exactly the URL I need, a second GET request to Cloudant is saved.</p><p>Another hack relates to how Cloudant deletes documents: they are only marked as deleted by setting a special field <code>_deleted</code> to true. If another document with the same <code>_id</code> is added later, the original document is reused (with <code>_deleted</code> cleared and a new <code>_rev</code>). To avoid duplicate notifications, the action only notifies when <code>_rev</code> starts with &#8220;1-&#8221; (which means the document is being added for the very first time).</p><p>One more hack is that the <code>crawl</code> action actually does batch processing (multiple feeds at a time). Initially, the <code>cron</code> action simply dispatched one feed per <code>crawl</code> action. Most of what the <code>crawl</code> action does is wait for the feed retrieval to return, which typically takes about 2 to 3 seconds. Batching multiple feeds together in one <code>crawl</code> action invocation does not increase its processing time. Remember, OpenWhisk is billed by actual usage (time and memory). With a batch size of 20, the cost becomes 1/20. 
That&#8217;s a significant cost saving.</p><p>You might notice that I use the Cloudant npm module directly in my actions, instead of using <a href="https://console.ng.bluemix.net/docs/openwhisk/openwhisk_cloudant.html#openwhisk_catalog_cloudant">OpenWhisk&#8217;s pre-installed Cloudant package</a>.</p><p>Latency is not the reason, as my Cloudant usage is at most one level deep (as I found out <a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-2-how-to-structure-functions-ccd1c65766d5">previously</a>). The main reason is simply the lack of functionality. For example, the pre-installed Cloudant package only supports read/write by document ID. I can&#8217;t use bulk operations with it, nor can I use views with it.</p><p>Using a single <code>cron</code> action to iterate through all feeds is fine for my personal usage. After all, I only have about 10 feeds registered. What if you have a million feeds?</p><p>With a batch size of 20, that means 50 thousand <code>crawl</code> action invocations in parallel. I think it should still scale because FaaS is supposed to handle this well.</p><p>Okay, we might want to be nicer to the OpenWhisk platform so that it is not crashed or slowed down by our unnecessary workload. Or maybe we just want to be&nbsp;&#8230; fancier, like using a different crawl frequency for each feed. We could actually create one cron per feed, by using a per-feed trigger and rule:</p><pre><code>&gt; wsk trigger create feed-aggregator-dev-crawl-trigger-0001 \
  --feed /whisk.system/alarms/alarm \
  -p cron "0 * * * *" \
  -p trigger_payload "{\"url\":\"https://bryantsai.com/feed\"}"</code></pre><pre><code>&gt; wsk rule create feed-aggregator-dev-crawl-trigger-0001-rule \
  feed-aggregator-dev-crawl-trigger-0001 \
  feed-aggregator-dev-crawl</code></pre><p>This needs to be performed in the <code>put</code> action. When a new feed is registered, a cron trigger and a rule are created. When the trigger is&nbsp;&#8230; triggered, the payload is passed as the parameter to the associated action. The <code>delete</code> action needs to delete the corresponding trigger and rule. The benefit is that we can now spread the workload across the hour (random minute/second). We can also adjust the schedule based on each individual feed&#8217;s publishing frequency.</p><p>The downside of this, of course, is the extra complexity of managing all these per-feed triggers and rules. It is really cumbersome to set these up using the openwhisk npm module, as 3 API calls are needed for both creation and deletion. The error handling is just&nbsp;&#8230; unmanageable, as manual rollback is necessary when things go wrong in the middle. The other downside is cost, as we are now back to one feed per action.</p><p>I&#8217;m happy with a single cron action.</p><h4><a href="https://serverless.com/framework/docs/providers/openwhisk/guide/">Serverless Framework with OpenWhisk</a></h4><p>Basically, Serverless Framework is a tool that makes it much easier to manage your FaaS &#8220;projects&#8221;. 
If you have more than just a few actions, using it saves the trouble of creating lots of shell scripts.</p><p>Everything, except code, is declared in one file <code>serverless.yml</code>:</p><ol><li><p>Action default parameters</p></li><li><p>Feeds: for now, you can define <code>/whisk.system/alarms</code>, <code>/whisk.system/cloudant</code>, <code>/whisk.system/messaging</code> directly as events for actions and Serverless Framework automatically creates/manages the corresponding OpenWhisk triggers and rules for you.</p></li><li><p>Triggers and their parameters: for other OpenWhisk feeds, you can still define custom triggers and Serverless Framework can still automatically create/manage the corresponding OpenWhisk triggers and rules for you.</p></li><li><p>The association of actions and triggers: this is done by defining the actions&#8217; events. Again, Serverless Framework manages the corresponding OpenWhisk rules for you.</p></li></ol><p>I can simply issue the command <code>serverless deploy</code> and everything will be set up appropriately on OpenWhisk. There is no longer any need to manage all those feeds/triggers/rules individually and manually. Managing all related artifacts has never been easier. Automation is king.</p><p>That&#8217;s it! It even supports packaging actions as Node.js modules (see the <code>crawl</code> action defined in <code>serverless.yml</code>). There&#8217;s really nothing else special about using Serverless Framework with OpenWhisk.</p><p>Using Serverless Framework with OpenWhisk is a no-brainer to me.</p><p>Note that at this point, I&#8217;ve encountered a few issues in using Serverless with OpenWhisk:</p><ol><li><p>It does not support the management of package bindings yet. So I have to use a shell script <code>create-cloudant-binding.sh</code>, which needs to be run first.</p></li><li><p>It does not support the new API gateway yet, so the http event does not work properly for me. 
I have to set up the API on OpenWhisk manually for my web action <code>feed</code>.</p></li><li><p>All action names are prefixed with the Serverless Framework project name (<code>crawl</code> becomes <code>feed-aggregator-dev-crawl</code> in OpenWhisk). Automatically created rules are also named based on their associated actions/triggers. This helps a lot with OpenWhisk&#8217;s cluttered namespace. Except&nbsp;&#8230; custom trigger names are not managed in the same way.</p></li></ol><p>Here&#8217;s the project, have fun!</p><p><strong><a href="https://github.com/bryantsai/feed-aggregator" title="https://github.com/bryantsai/feed-aggregator">bryantsai/feed-aggregator</a></strong></p><h4>Notes</h4><ol><li><p>In case you are interested, here are my must-read feeds:&nbsp;<br><a href="https://blog.acolyer.org/feed/">https://blog.acolyer.org/feed/</a><br><a href="http://highscalability.com/rss.xml">http://highscalability.com/rss.xml</a><br><a href="http://randsinrepose.com/feed/">http://randsinrepose.com/feed/</a><br><a href="http://stratechery.com/feed/">http://stratechery.com/feed/</a><br><a href="http://waitbutwhy.com/feed">http://waitbutwhy.com/feed</a><br><a href="http://steve-yegge.blogspot.com/feeds/posts/default?alt=rss">http://steve-yegge.blogspot.com/feeds/posts/default?alt=rss</a><br><a href="https://www.confluent.io/feed/">https://www.confluent.io/feed/</a></p></li></ol>]]></content:encoded></item><item><title><![CDATA[Serverless IoT Analytics with OpenWhisk Part 3 — How to keep state?]]></title><description><![CDATA[This is the third 
part of a series.]]></description><link>https://www.bryantsai.com/p/serverless-iot-analytics-with-openwhisk-part-3</link><guid isPermaLink="false">https://www.bryantsai.com/p/serverless-iot-analytics-with-openwhisk-part-3</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Wed, 10 May 2017 02:08:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1f0e0e30-7569-49bb-994f-48cb0b857eb9_1125x750.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ng-V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ng-V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ng-V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ng-V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ng-V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ng-V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ng-V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ng-V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ng-V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ng-V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0664257-6e1c-470a-849c-a92ea184ff23_1125x750.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This is the third part of a series.</p><ol><li><p><a 
href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-1-is-it-slower-4b66b5f42a5">Is it slower?</a></p></li><li><p><a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-2-how-to-structure-functions-ccd1c65766d5">How to structure functions?</a></p></li><li><p><a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-3-how-to-keep-state-3fdb429ca7de">How to keep state?</a></p></li></ol><p>FaaS is stateless, which means it provides neither long-term storage nor short-term memory. If we want to keep &#8220;something&#8221; across invocations, it has to be stored somewhere external.</p><p><a href="https://console.ng.bluemix.net/docs/services/IoT/analytics.html">Watson IoT Analytics</a> has two &#8220;states&#8221;. One is rules and triggering actions. The other is for stateful rule processing, the so-called &#8220;frequency requirement&#8221; on rules.</p><p>Let&#8217;s look at rules and triggering actions first.</p><p>In previous parts, I simply embedded them in code. That works fine. But wait! Hard-coded data? Shouldn&#8217;t we use a database?</p><p>That&#8217;s not hard to do, so let&#8217;s use <a href="https://console.ng.bluemix.net/catalog/cloudant-nosql-db">Cloudant</a> for storing rules and triggering actions first. Both rules and triggering actions are stored as documents in Cloudant. A map-reduce view is created to provide an easy way to get all rules with their associated triggering actions embedded.</p><p>My revised OpenWhisk action now sends a query to Cloudant to get all rules (together with their associated triggering actions).</p><p>I&#8217;ve added 3 parameters for Cloudant-related configuration. They need to be set as default parameters, since my OpenWhisk action is invoked by a <a href="https://console.ng.bluemix.net/catalog/services/message-hub">Message Hub</a> trigger, which knows nothing about them.</p><pre><code>&gt; wsk action update iot-analytics-3 iot-analytics-3.js
&gt; wsk action update iot-analytics-3 -p cloudant_username &lt;cloudant account&gt; -p cloudant_password &lt;cloudant password&gt; -p cloudant_db &lt;name of cloudant database&gt;</code></pre><p>Now the fun part. Since we are using an external database, is it slower than embedded version?</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V6HV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V6HV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 424w, https://substackcdn.com/image/fetch/$s_!V6HV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 848w, https://substackcdn.com/image/fetch/$s_!V6HV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 1272w, https://substackcdn.com/image/fetch/$s_!V6HV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V6HV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V6HV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 424w, https://substackcdn.com/image/fetch/$s_!V6HV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 848w, https://substackcdn.com/image/fetch/$s_!V6HV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 1272w, https://substackcdn.com/image/fetch/$s_!V6HV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1192c181-ba98-4f5a-9229-0e699dcb7bbc_716x273.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Not by much, it seems. The 95th percentile duration is 0.64 (embedded) vs. 0.68 (Cloudant) seconds. That&#8217;s close enough to be ignored. However, the 99th percentile duration shows the impact of the remote database query on the slower end (worst cases): 0.74 vs. 
1.02 seconds.</p><p>Note that I did use a simple cache[1] to avoid querying Cloudant on each invocation, which might be the reason for the comparable latency. Most of the invocations simply used the cached data in memory. As mentioned in <a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-2-how-to-structure-functions-ccd1c65766d5">part 2</a>, OpenWhisk action instances do get &#8220;reused/cached&#8221; and this kind of optimization is pretty common (and very useful for performance).</p><p>For the record, this already works well enough for my use case.</p><p>BTW, Watson IoT Analytics does similar caching. In fact, it does not query the database at all. All rules and triggering actions are kept in a local cache and we have a notification mechanism to inject changes into running instances. This makes it possible to have extremely short processing times.</p><h4>Embedded vs. External&nbsp;Database</h4><p>Before we move on, I want to drill down on this one further.</p><p>It&#8217;s probably against common (good) programming practice to embed data in code. But if the network to the database is not fast enough, the embedded approach is one of the few choices for near real-time IoT use cases.</p><p>I&#8217;ve been pondering this for some time. The downside of the embedded approach is increased complexity in function deployment management. When functions are deployed, we need some kind of template (fragment replacement) to inject the data into the code. This should be built into the deployment pipeline with the database (where the data is stored) as input. We obviously need to track the &#8220;base&#8221; for every deployed function. Also, we need to track the link between a deployed function and the data injected. When new code is out or when data is changed, we need to replace/upgrade all related deployed functions.</p><p>That sounds complicated! 
But there are a few reasons why this may not be as bad an idea as it seems:</p><ol><li><p>Before FaaS, having one general codebase serve different &#8220;data&#8221; meant we only needed to run one or just a few servers. This is called <a href="https://en.wikipedia.org/wiki/Multitenancy">multitenancy</a>.</p></li><li><p>With FaaS, this is no longer necessary. Since FaaS only charges for real usage, we can conveniently deploy one instance per &#8220;tenant&#8221;. The unit can be per user or even per rule. No use, no cost, no worry.</p></li><li><p>FaaS also takes care of elastic scaling for each deployed function individually, through peak, off-peak and even no-traffic periods.</p></li><li><p>The data to be injected into one function is small. It is just one or a few rules of one user.</p></li><li><p>FaaS deployment is both easy and fast. When data is changed, instantly redeploying the related functions is no big deal.</p></li></ol><p>If the latency of using an external database is of concern, this could be a viable option.</p><p>That being said, as long as the external database works fine, I will not pursue this approach, at least for now.</p><h4>Stateful Processing</h4><p>Watson IoT Analytics rules can have a frequency requirement specified. 
It basically is count and/or time-based constraints controlling whether rule triggering is to be carried out.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ryef!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ryef!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 424w, https://substackcdn.com/image/fetch/$s_!Ryef!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 848w, https://substackcdn.com/image/fetch/$s_!Ryef!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 1272w, https://substackcdn.com/image/fetch/$s_!Ryef!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ryef!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c272c39c-e8a4-473b-b240-76187a509221_724x345.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ryef!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 424w, https://substackcdn.com/image/fetch/$s_!Ryef!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 848w, https://substackcdn.com/image/fetch/$s_!Ryef!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 1272w, https://substackcdn.com/image/fetch/$s_!Ryef!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc272c39c-e8a4-473b-b240-76187a509221_724x345.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Watson IoT Analytics&#8202;&#8212;&#8202;Rule Frequency</figcaption></figure></div><p>Stateful processing requires short-term storage for storing intermediate results. The fastest short-term storage can be in-process memory or host storage. Unfortunately, FaaS does not provide that. 
To implement the &#8220;frequency requirement&#8221; on rules, we need to use external storage for keeping the &#8220;state&#8221;.</p><p>For example, for the 3rd option &#8220;<em>Trigger only the first time conditions are met and reset when conditions are no longer met</em>&#8221;, we can save each incoming message&#8217;s timestamp along with the rule match result in Cloudant. Whenever there&#8217;s a rule match, we fetch the list of messages sorted by timestamp in descending order. If the immediately preceding message did not have a match, we trigger the rule&#8217;s associated actions; otherwise, we don&#8217;t.</p><p>However, there&#8217;s another restriction of FaaS (at least the current options available on the market) which makes stateful &#8220;streaming&#8221; processing difficult. There&#8217;s no ordering guarantee on invocations. The following is from OpenWhisk&#8217;s documentation:</p><blockquote><p>Invocations of an action are not ordered. If the user invokes an action twice from the command line or the REST API, the second invocation might run before the first. If the actions have side effects, they might be observed in any&nbsp;order.</p></blockquote><blockquote><p>Additionally, there is no guarantee that actions will execute atomically. Two actions can run concurrently and their side effects can be interleaved. OpenWhisk does not ensure any particular concurrent consistency model for side effects. Any concurrency side effects will be implementation-dependent.</p></blockquote><p>The only way is to roll our own &#8220;buffering/windowing/watermarking&#8221;. I doubt that is worth the effort. At this point, if message order is required, I&#8217;d say use a stream processing platform.</p><h4>Notes</h4><ol><li><p>The simple cache does not consider changes to the rules and/or triggering actions on the Cloudant side. 
To evict stale data from the cache, we can use the OpenWhisk Cloudant trigger to receive notifications of such changes and force the corresponding OpenWhisk actions to reload accordingly.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[Serverless IoT Analytics with OpenWhisk Part 2 — How to structure functions?]]></title><description><![CDATA[This is the second part of a series.]]></description><link>https://www.bryantsai.com/p/serverless-iot-analytics-with-openwhisk-part-2</link><guid isPermaLink="false">https://www.bryantsai.com/p/serverless-iot-analytics-with-openwhisk-part-2</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Wed, 26 Apr 2017 06:13:45 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0d56f6c8-1a7d-4676-b5d1-99289a9c78d1_1017x750.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZpV-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZpV-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 424w, https://substackcdn.com/image/fetch/$s_!ZpV-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 848w, https://substackcdn.com/image/fetch/$s_!ZpV-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ZpV-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZpV-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZpV-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 424w, https://substackcdn.com/image/fetch/$s_!ZpV-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 848w, https://substackcdn.com/image/fetch/$s_!ZpV-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ZpV-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61be90d-2440-479e-8603-20b4bbf7bce5_1017x750.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This is the second part of a series.</p><ol><li><p><a href="https://medium.com/openwhisk/serverless-iot-analytics-with-openwhisk-part-1-is-it-slower-96278770a87b">Is it slower?</a></p></li><li><p><a href="https://medium.com/openwhisk/serverless-iot-analytics-with-openwhisk-part-2-how-to-structure-functions-e0cb192174e">How to structure functions?</a></p></li><li><p><a href="https://medium.com/openwhisk/serverless-iot-analytics-with-openwhisk-part-3-how-to-keep-state-4b3bad818f0">How to keep state?</a></p></li></ol><p>My first take on implementing <a href="https://console.ng.bluemix.net/docs/services/IoT/analytics.html">Watson IoT Analytics</a> with <a href="https://console.ng.bluemix.net/openwhisk/">OpenWhisk</a> was pretty minimal in functionality. Now I&#8217;m going to start iterating. 
Here&#8217;s the event processing flow in Watson IoT Analytics.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cS6m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cS6m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 424w, https://substackcdn.com/image/fetch/$s_!cS6m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 848w, https://substackcdn.com/image/fetch/$s_!cS6m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 1272w, https://substackcdn.com/image/fetch/$s_!cS6m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cS6m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cS6m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 424w, https://substackcdn.com/image/fetch/$s_!cS6m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 848w, https://substackcdn.com/image/fetch/$s_!cS6m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 1272w, https://substackcdn.com/image/fetch/$s_!cS6m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b6b9ed2-5a07-4b93-8446-d465be130ef9_800x256.png 1456w" sizes="100vw"></picture><div></div></div></a><figcaption class="image-caption">Watson IoT Analytics</figcaption></figure></div><p>An incoming event first goes through the rule evaluation and then a stateful processing. If the event passes through the rule filtering, it then checks if anything needs to be triggered and then triggers it. 
Part 1 implements just the rule evaluation and webhook posting, all within a single OpenWhisk action.</p><p>The next question is: how should I structure my function (or functions) to add that conditional branching logic?</p><h4>Latency</h4><p>There are at least 3 options available[1]. To answer the question, I first need to understand the latency of each option.</p><ol><li><p>Inline: all functions are put in a single OpenWhisk action.</p></li><li><p><a href="https://console.ng.bluemix.net/docs/openwhisk/openwhisk_actions.html#openwhisk_create_action_sequence">Sequence</a>: each function is an action and an OpenWhisk sequence is used to chain them together.</p></li><li><p><a href="https://www.npmjs.com/package/openwhisk">OpenWhisk npm package</a>: each function is an action and a main controller action uses the npm package to invoke and coordinate them.</p></li></ol><p>I tested the 3 options with 2 different settings. The first setting is just one level of function call (for the inline option):</p><pre><code>// increment.js
function main(params) {
 params.value = params.value || 0;
 return {value: inc(params.value)};
}

function inc(value) {
 return value + 1;
}</code></pre><p>The second setting is 10 nested levels (again shown for the inline option):</p><pre><code>// inline-10-level.js
function main(params) {
 params.value = params.value || 0;
 return {value: inc(inc(inc(inc(inc(inc(inc(inc(inc(inc(params.value))))))))))};
}

function inc(value) {
 return value + 1;
}</code></pre><p>For the sequence option:</p><pre><code>// sequence-main.js
function main(params) {
 return params;
}</code></pre><pre><code>----</code></pre><pre><code>&gt; wsk action update sequence-1-level sequence-main,increment --sequence</code></pre><pre><code>&gt; wsk action update sequence-10-level sequence-main,increment,increment,increment,increment,increment,increment,increment,increment,increment,increment --sequence</code></pre><p>For the npm package option, 1 level (the 10-level code is too ugly to show&#8230;):</p><pre><code>// npm-1-level.js
var openwhisk = require('openwhisk');

function main(params) {
 params.value = params.value || 0;
 var p = {name: 'increment', blocking: true, result: true, params: params};
 var ow = openwhisk(); 
 return ow.actions.invoke(p).then(result =&gt; {
  return {value: result.response.result.value};
 });
}</code></pre><p>Let&#8217;s see the comparison result:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2ruO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2ruO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 424w, https://substackcdn.com/image/fetch/$s_!2ruO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 848w, https://substackcdn.com/image/fetch/$s_!2ruO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 1272w, https://substackcdn.com/image/fetch/$s_!2ruO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2ruO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2ruO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 424w, https://substackcdn.com/image/fetch/$s_!2ruO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 848w, https://substackcdn.com/image/fetch/$s_!2ruO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 1272w, https://substackcdn.com/image/fetch/$s_!2ruO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb8d9579-58ad-47ac-8e41-b09b9d84b2cd_447x169.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">all times in milliseconds</figcaption></figure></div><p>No surprise here: the inline option is the fastest, sequence comes next, and the npm package is the slowest. The sequence option has a bigger variance though.</p><ul><li><p>As expected, the inline option does not differ much in latency (1 vs. 
10 levels).</p></li><li><p>It&#8217;s understandable that each action in a sequence adds some overhead, but it does add up. Also, the more actions in a sequence, the bigger the variance.</p></li><li><p>The OpenWhisk npm package uses the <a href="https://console.ng.bluemix.net/docs/openwhisk/openwhisk_reference.html#openwhisk_ref_restapi">OpenWhisk REST API</a> to invoke actions, so each action invocation incurs overhead such as network round trips and authentication. It adds quite a bit of latency.</p></li></ul><p>Note that the latency overhead of both the sequence and npm package options is fixed (per action). It is not proportional to action duration[2]. My action is short and fast (a few milliseconds), so the overhead seems especially severe. But if your action runs for seconds, it is probably not a big issue as long as you use just a few nested levels.</p><p>What does this tell us? Unless your actions are all shorter than 10 ms, you don&#8217;t need to worry much about the latency overhead introduced by using either a sequence or the npm package. Using a sequence is faster than using the npm package, but as I&#8217;ll show later, sometimes we have no choice because OpenWhisk sequences are limited (for example, you can&#8217;t short-circuit in the middle, and you can only have one flow).</p><p>On the other hand, this does not suggest blindly transforming existing code to the FaaS world, such as a one-to-one mapping of functions to actions. That would be very wrong. The unit of &#8220;function&#8221; should be anything related to the exposed &#8220;service&#8221;. There&#8217;s no point in further breaking down that unit on FaaS.</p><h4>Artifact Size</h4><p>Out of curiosity, I wondered whether the artifact size would affect an action&#8217;s performance.</p><p>I tried using a 14 MB zipped Node.js action. The normal duration stays pretty much the same, but those periodic spikes differ. 
For this 14 MB artifact the spikes are around 2000 ms, compared to around 50 ms for the original version.</p><p>Since the spikes happen pretty consistently (once every 50 runs), it&#8217;s not entirely correct to say they don&#8217;t affect runtime performance. If you have a large artifact, you should expect a 2% chance of hitting a slow run[3].</p><h4>Iteration #2</h4><p>I&#8217;ve made some changes to my action from part 1:</p><ol><li><p>I&#8217;ve split the webhook posting into its own action. This is because the webhook posting is intended to be a general &#8220;service&#8221; which can be used somewhere else.</p></li><li><p>If no event exceeds our threshold, no invocation is made to the separate webhook action. No unnecessary cost.</p></li><li><p>BTW, the webhook action can accept a batch of events. Cost saving again.</p></li><li><p>The webhook body can now be customized, similar to Watson IoT Analytics.</p></li><li><p>Rule/webhook metadata is extracted out of the code, which opens the door to passing it in as action parameters.</p></li></ol><p>Note that the latency of this version does not differ from the last one.</p><h4>Notes</h4><ol><li><p>Another option is to use <a href="https://console.ng.bluemix.net/catalog/services/message-hub">Message Hub</a> topics as queues between actions. Each action publishes its outcome for downstream actions to consume. This option is considerably more cumbersome to set up and obviously not fast. I consider it more of an external asynchronous integration.</p></li><li><p>I added a sleep inside my action to make it longer (a few seconds). In that case, the latency overhead stays the same as without the sleep.</p></li><li><p>During the latency tests, I noticed one very interesting thing. Normally my function&#8217;s duration is around 3 ms, but it jumps to around 50 ms once every 50 invocations. I tried increasing the invocation rate from 1 event per 200 ms to 1 event per 1 ms. Both showed the same behavior. 
I guess some kind of caching is based on counts?</p></li></ol><p><em>Originally published at <a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-2-how-to-structure-functions-ccd1c65766d5">bryantsai.com</a> on April 26, 2017.</em></p>]]></content:encoded></item><item><title><![CDATA[Serverless IoT Analytics with OpenWhisk Part 1 — Is It Slower?]]></title><description><![CDATA[As I explored the serverless world (the FaaS area), I always wondered how it compared to the &#8220;normal&#8221; way. Would it slow things down? Is it&#8230;]]></description><link>https://www.bryantsai.com/p/serverless-iot-analytics-with-openwhisk-part-1</link><guid isPermaLink="false">https://www.bryantsai.com/p/serverless-iot-analytics-with-openwhisk-part-1</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Wed, 19 Apr 2017 08:07:37 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/367f13d4-7de4-4fcf-8cd5-2e2c62537c6e_1118x750.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gs7i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gs7i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 424w, https://substackcdn.com/image/fetch/$s_!Gs7i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 848w, 
https://substackcdn.com/image/fetch/$s_!Gs7i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 1272w, https://substackcdn.com/image/fetch/$s_!Gs7i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Gs7i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Gs7i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 424w, https://substackcdn.com/image/fetch/$s_!Gs7i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 848w, 
https://substackcdn.com/image/fetch/$s_!Gs7i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 1272w, https://substackcdn.com/image/fetch/$s_!Gs7i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0bed46d-9201-491b-8824-87cfa2bef7bd_1118x750.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a><figcaption class="image-caption"><a href="https://www.pexels.com/u/skitterphoto/">skitterphoto</a></figcaption></figure></div><p>As I explored the <a href="https://martinfowler.com/articles/serverless.html">serverless</a> world (the FaaS area), I always wondered how it compared to the &#8220;normal&#8221; way. Would it slow things down? Is it really more cost-effective? What would be the right scope of a &#8220;function&#8221;? Would using it for stateful processing still perform well?</p><p>This is the first part of a series in which I try to figure things out.</p><ol><li><p><a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-1-is-it-slower-4b66b5f42a5">Is it slower?</a></p></li><li><p><a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-2-how-to-structure-functions-ccd1c65766d5">How to structure functions?</a></p></li><li><p><a href="https://bryantsai.com/serverless-iot-analytics-with-openwhisk-part-3-how-to-keep-state-3fdb429ca7de">How to keep state?</a></p></li></ol><p>IoT analytics fits the event-driven pattern pretty well. A device sends in an event to trigger some analytics processing which runs on a FaaS platform. We have already built Watson IoT Analytics, so it&#8217;s a perfect baseline for me to compare against.</p><p>Watson IoT Analytics has a lot of horsepower in the backend, the &#8220;servers&#8221;, so of course, it scales pretty well, both in the number of users and in the event throughput it is capable of handling. 
I want to implement the same analytics flow with OpenWhisk, without setting up my own servers. The &#8220;serverless&#8221; way.</p><p>To follow along, some preparation work is needed:</p><ol><li><p>Have your Bluemix <a href="https://console.ng.bluemix.net/catalog/services/internet-of-things-platform">Watson IoT Platform</a>, <a href="https://console.ng.bluemix.net/catalog/services/message-hub">Message Hub</a>, and <a href="https://console.ng.bluemix.net/openwhisk/">OpenWhisk</a> instances ready.</p></li><li><p><a href="https://console.ng.bluemix.net/docs/services/IoT/message_hub.html">Connect</a> a historical data storage extension on Watson IoT Platform to the Message Hub instance you want to use. For simplicity, I use a default topic <code>all-devices</code> for everything.</p></li><li><p>On Watson IoT Platform, create the device type <code>iot</code> and one device under it named <code>thermal</code>.</p></li></ol><p>Let&#8217;s verify that everything is set up correctly. Use the Message Hub REST API to create a consumer instance, then use the Watson IoT Platform REST API to publish an event. The event should be forwarded to the configured Message Hub topic <code>all-devices</code>. Finally, use the Message Hub REST API to fetch the message back from the topic.</p><pre><code>&gt; curl -X POST \
  -H "Content-Type: application/vnd.kafka.v1+json" \
  -H "X-Auth-Token: &lt;Message Hub api_key&gt;" \
  --data '{"name": "my_consumer_instance", "format": "binary", "auto.offset.reset": "largest"}' \
https://kafka-rest-prod01.messagehub.services.us-south.bluemix.net/consumers/my_json_consumer</code></pre><pre><code>&gt; curl -u "use-token-auth:&lt;device auth token&gt;" \
  -H Content-Type:application/json \
  --data-ascii "{\"temperature\":20}" \
  https://&lt;orgId&gt;.messaging.internetofthings.ibmcloud.com/api/v0002/device/types/iot/devices/thermal/events/status</code></pre><pre><code>(A few seconds later ...)</code></pre><pre><code>&gt; curl -X GET \
  -H "Accept: application/vnd.kafka.binary.v1+json" \
  -H "X-Auth-Token: &lt;Message Hub api_key&gt;" \
      https://kafka-rest-prod01.messagehub.services.us-south.bluemix.net/consumers/my_json_consumer/instances/my_consumer_instance/topics/all-devices
[{"key":"eyJvJ0=","value":"eyJIn0=","partition":0,"offset":4}]</code></pre><pre><code>&gt; curl -X DELETE \
  -H "X-Auth-Token: &lt;Message Hub api_key&gt;" \
https://kafka-rest-prod01.messagehub.services.us-south.bluemix.net/consumers/my_json_consumer/instances/my_consumer_instance</code></pre><p>Note that the forwarding to Message Hub could take a few seconds because Watson IoT Platform batches the events. My experiments suggest the batch duration is around 6&#8211;7 seconds.</p><p>Now that our events can reach Message Hub, we can create an action on OpenWhisk to apply analytics rules. The action forwards an event to a webhook endpoint if its temperature exceeds some threshold value. This is exactly what Watson IoT Analytics does, at least its most basic function.</p><p>The action code is super easy. Since OpenWhisk&#8217;s Message Hub feed batches messages, we have to handle batches in our action. Once we have the code ready, the remaining work is to create an action with it and set it up with a trigger and a rule:</p><pre><code>&gt; wsk action create iotp-&lt;orgId&gt; ./iot-analytics.js
ok: created action iotp-&lt;orgId&gt;</code></pre><pre><code>&gt; wsk trigger create iotp-&lt;orgId&gt;-trigger --feed /_/Bluemix_serverless-iot-kafka_Credentials-1/messageHubFeed \
  --param isJSONData true \
  --param topic all-devices
ok: invoked /_/Bluemix_serverless-iot-kafka_Credentials-1/messageHubFeed with id 1458846b661c45789007a833cc819621
(... omitted ...)
ok: created trigger iotp-&lt;orgId&gt;-trigger</code></pre><pre><code>&gt; wsk rule create iotp-&lt;orgId&gt;-rule iotp-&lt;orgId&gt;-trigger iotp-&lt;orgId&gt;
ok: created rule iotp-&lt;orgId&gt;-rule</code></pre><p>I use <a href="http://waithook.com/">WaitHook</a> (which is super cool!) as the webhook endpoint so I can easily verify the result. Send some events again and you should see them pop up on the WaitHook page.</p><p>Not too bad, right? A few steps and we have our serverless IoT Analytics!</p><h4>Latency</h4><p>So back to my first question: is there any performance penalty for using FaaS?</p><p>To understand the latency of this solution, I tested a single device sending events at different rates: from 1 event per second to 1 event per 50 ms[4]. The OpenWhisk action posts to WaitHook in the end. I also open a websocket connection to WaitHook to stream back the response in order to measure the real end-to-end response time.</p><p>The end-to-end response time (1 through 4 marked in the chart below) is around 5.9 sec (95th percentile) and 3.6 sec (median). This does not seem fast at all. However, if we look at the OpenWhisk action&#8217;s latency, which is from 3 to 4 below, it is only 0.54 sec (95%) and 0.3 sec (median). This is actually quite good because it includes the webhook posting and the websocket communication back to my device, not just the action&#8217;s invocation. 
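</p><p>The post does not say how the median and 95% figures were computed. As one possible reading, here is a minimal sketch using the nearest-rank method; the <code>percentile</code> helper is a hypothetical name and the sample values are made up for illustration, not measurements from these tests.</p><pre><code>// percentile-sketch.js (illustrative only)
// Nearest-rank percentile of a list of response times in seconds.
function percentile(samples, p) {
 // Sort a copy numerically so the caller's array is left untouched.
 var sorted = samples.slice().sort(function (a, b) { return a - b; });
 var rank = Math.ceil((p / 100) * sorted.length);
 return sorted[Math.max(rank, 1) - 1];
}

// Made-up sample values, not measurements from this post.
var samples = [0.3, 0.2, 0.5, 0.4, 0.6];
var summary = {median: percentile(samples, 50), p95: percentile(samples, 95)};</code></pre><p>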
BTW, it&#8217;s 0.23 sec (95%) and 0.21 sec (median) from 1 to 2 below.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HEEb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HEEb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 424w, https://substackcdn.com/image/fetch/$s_!HEEb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 848w, https://substackcdn.com/image/fetch/$s_!HEEb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 1272w, https://substackcdn.com/image/fetch/$s_!HEEb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HEEb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d8895d7a-4c06-4205-94ca-eb316d599373_740x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HEEb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 424w, https://substackcdn.com/image/fetch/$s_!HEEb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 848w, https://substackcdn.com/image/fetch/$s_!HEEb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 1272w, https://substackcdn.com/image/fetch/$s_!HEEb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8895d7a-4c06-4205-94ca-eb316d599373_740x350.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">somewhat &#8220;scaled&#8221; to time spent on the edges&nbsp;&#8230;</figcaption></figure></div><p>So the major time spent is from 2 to 3 (more precisely, it&#8217;s the first part from Watson IoT to Message Hub). As I mentioned before, Watson IoT Message Hub extension seems to use a long batch duration, which is probably 6&#8211;7 sec. 
Unfortunately, there&#8217;s no way to tune that yet.</p><p>If this level of latency is acceptable for you, then it is perfect. But if you are looking for near real-time IoT analytics, this is clearly not going to meet your needs. There&#8217;s really no easy &#8220;serverless&#8221; way[1] unless Watson IoT somehow provides a shorter batch duration.</p><p>To see how fast it can be, even though not really &#8220;serverless&#8221;, we can use the <a href="https://github.com/bryantsai/openwhisk-package-mqtt-watson">Watson IoT MQTT feed</a>[2]. This feed allows OpenWhisk actions to directly consume device events from Watson IoT without the intermediate Message Hub. However, an intermediate feed provider needs to be deployed. The main difference is that there&#8217;s no long batch.</p><p>Now the end-to-end response time is just 0.56 sec (95%) and 0.35 sec (median)! It is much, much faster.</p><p>With this level of latency, I believe this is good enough for most IoT real-time analytics use cases. 
Of course, the downside is there&#8217;s an extra &#8220;server&#8221; to maintain/operate/scale.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r5im!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r5im!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 424w, https://substackcdn.com/image/fetch/$s_!r5im!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 848w, https://substackcdn.com/image/fetch/$s_!r5im!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 1272w, https://substackcdn.com/image/fetch/$s_!r5im!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r5im!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r5im!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 424w, https://substackcdn.com/image/fetch/$s_!r5im!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 848w, https://substackcdn.com/image/fetch/$s_!r5im!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 1272w, https://substackcdn.com/image/fetch/$s_!r5im!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd59a3d4-53da-4ddb-b3bd-b5acd63e915f_740x340.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">again, &#8220;scaled&#8221; to time spent on the edges&nbsp;&#8230;</figcaption></figure></div><h4>Scalability</h4><p>Obviously, it scales without me doing any work. Each incoming event triggers an action invocation on OpenWhisk. The more events or the faster they come, the more action instances are launched by OpenWhisk in parallel. 
The upper bound is only the <a href="https://console.ng.bluemix.net/docs/openwhisk/openwhisk_reference.html#openwhisk_syslimits">limit</a> imposed by OpenWhisk.</p><p>Better yet, when there&#8217;s no event, no resource is used. No usage, no cost!</p><h4>Cost</h4><p>Speaking of cost: OpenWhisk charges by actual usage of time and memory. Both of my actions above generally take 300 ms. Unless your action is way faster than 100 milliseconds (it rounds up to the next 100 ms), there&#8217;s really no need to think too much[3]. For OpenWhisk, just make sure you choose the minimum memory actually required by your action.</p><p>One aspect is worth considering though. The cost model is per action invocation. Compare the two methods I used above: the Message Hub one invokes my action roughly once per 6 seconds (because Watson IoT batches messages), while the MQTT one invokes my action once for each incoming message. If we send in 100 messages per second, that is roughly 600 invocations versus 1 over the same 6 seconds, so the difference in cost is huge!</p><p>This is obviously a tradeoff. I don&#8217;t want latency as high as 6~7 seconds, but I don&#8217;t need an invocation per message either (too expensive). Also, from the perspective of the action&#8217;s processing time, there&#8217;s not much difference between processing 1 or 10 messages at a time. So a better approach would be to enable batching for the MQTT feed.</p><h4>Development Experience</h4><p>As you can see, my OpenWhisk action code is so simple that I don&#8217;t have much to talk about. But considering how little I did and spent&nbsp;&#8230; I have to say I&#8217;m totally convinced this is the way to prototype.</p><p>One obvious and very enjoyable thing about developing on OpenWhisk is that deployment is damn fast! I usually just issue <code>wsk action update&nbsp;...</code> right after finishing my code change, then <em>immediately</em> start sending events. It <em>ALWAYS</em> works as I expect. 
I haven&#8217;t encountered any deployment problem yet.</p><p>One more thing about deployment. An action&#8217;s very first invocation takes longer than subsequent invocations. I used a bare minimum action to test: the first invocation (after an update) usually takes 50 ms while subsequent ones take just 3 ms. I guess that&#8217;s &#8220;warming-up&#8221; time and OpenWhisk does some kind of &#8220;caching&#8221; for action instances. This is a great optimization opportunity.</p><p>As with Bluemix and Cloud Foundry, logging is not very convenient. But for this little experiment, the command <code>wsk activation poll</code> suffices.</p><h4>Notes</h4><ol><li><p>Well, there is one actually: Watson IoT Analytics. For the curious mind, the end-to-end response time when using Watson IoT Analytics itself is 1.93 sec (95%) and 1.43 sec (median). This is mainly because of the Spark Streaming micro-batch interval we use.</p></li><li><p>The original version was outdated so I updated it to work with current OpenWhisk. Also, I encountered a performance issue which turned out to be caused by the free Cloudant instance&#8217;s rate limit. I added a simple cache to the feed so that it does not go out to look up Cloudant for each message.</p></li><li><p>If your action takes much less than 100 ms, then you should look for opportunities to aggregate multiple calls.</p></li><li><p>For the Message Hub approach, I did later test 1 event per 25 ms and 1 event per 10 ms, which amounts to 240 and 600 events per batch per invocation. Understandably, each invocation takes longer to process since it now needs to wait for the completion of all 240/600 webhook postings. 
However, the processing time just slightly increases to 6~7 seconds for 25 ms case and ~14 (though with much bigger variance) seconds for 10 ms case&nbsp;.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[How to Trigger OpenWhisk Actions from Watson IoT Analytics]]></title><description><![CDATA[Watson IoT Analytics rule triggering supports several action types: email, webhook, Node-RED, and IFTTT. In most cases, these actions allow&#8230;]]></description><link>https://www.bryantsai.com/p/how-to-trigger-openwhisk-actions-from-watson-iot</link><guid isPermaLink="false">https://www.bryantsai.com/p/how-to-trigger-openwhisk-actions-from-watson-iot</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Fri, 07 Apr 2017 12:01:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/635a3e16-7348-4e3d-b4e0-60a9b1ca2407_800x520.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lTEH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lTEH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 424w, https://substackcdn.com/image/fetch/$s_!lTEH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 848w, 
https://substackcdn.com/image/fetch/$s_!lTEH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 1272w, https://substackcdn.com/image/fetch/$s_!lTEH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lTEH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lTEH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 424w, https://substackcdn.com/image/fetch/$s_!lTEH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 848w, 
https://substackcdn.com/image/fetch/$s_!lTEH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 1272w, https://substackcdn.com/image/fetch/$s_!lTEH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F027e2178-a5bd-48ba-bf59-588bde4e6900_800x520.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p><a href="https://console.ng.bluemix.net/docs/services/IoT/analytics.html">Watson IoT Analytics</a> rule triggering supports several <a href="https://console.ng.bluemix.net/docs/services/IoT/cloud_analytics.html#shared">action types</a>: email, webhook, <a href="https://nodered.org/">Node-RED</a>, and <a href="https://ifttt.com/">IFTTT</a>. In most cases, these actions allow you to integrate with almost any system. You can push a notification to mobile devices, post to a Slack channel, tweet it, persist to some data store, or even <a href="https://console.ng.bluemix.net/docs/services/IoT/devices/device_mgmt/requests.html">issue commands back to the devices</a>.[4]</p><p>But sometimes, we just need more. Maybe you don&#8217;t have an endpoint for receiving the notification. Maybe your endpoint is behind an enterprise firewall. Or, maybe you need some pre-processing applied before you receive the notification.</p><p>Many users use Node-RED for these kinds of use cases. The downside is an extra server set up and its maintenance. Also, the scaling of it has always been a challenge.</p><p>What if we could use any custom code as actions for Watson IoT Analytics, without setting up a server? 
That would be nice!</p><p>Let&#8217;s say we want to publish the notification from Watson IoT Analytics to a message queue so that we can easily decouple it from all downstream applications.</p><p>We are going to use <a href="https://console.ng.bluemix.net/openwhisk">OpenWhisk</a> for custom code and <a href="http://ibm.biz/message-hub-bluemix-catalog">Message Hub</a> for the message queue. Since OpenWhisk already has a built-in Message Hub package, we can even avoid writing our own code for this simple use case. First, create a package binding from <code>/whisk.system/messaging</code> with your Message Hub credentials as parameters.</p><pre><code>&gt; wsk package bind /whisk.system/messaging myMessageHub \
    -p kafka_brokers_sasl "[\"kafka01-prod01.messagehub.services.us-south.bluemix.net:9093\", \"kafka02-prod01.messagehub.services.us-south.bluemix.net:9093\", \"kafka03-prod01.messagehub.services.us-south.bluemix.net:9093\", \"kafka04-prod01.messagehub.services.us-south.bluemix.net:9093\", \"kafka05-prod01.messagehub.services.us-south.bluemix.net:9093\"]" \
    -p kafka_admin_url https://kafka-admin-prod01.messagehub.services.us-south.bluemix.net:443 \
    -p user &lt;Message Hub User&gt; \
    -p password &lt;Message Hub Password&gt; \
    -p topic myTopic</code></pre><p>In case you are not familiar with OpenWhisk yet, a package binding is just an alias for a package with some default parameters. By using package <code>myMessageHub</code> instead of <code>/whisk.system/messaging</code>, I don&#8217;t need to specify my Message Hub credentials over and over again. As you will see later, it also decouples the usage of this OpenWhisk action from the underlying Message Hub.</p><p>With this ready, you can test publishing some messages by invoking the action:</p><pre><code>&gt; wsk action invoke myMessageHub/messageHubProduce -p value "This is a test message"</code></pre><p>In addition to the CLI, OpenWhisk also has a REST API for invoking actions. Let&#8217;s try it out with <code>curl</code> first:</p><pre><code># first find out your OpenWhisk credentials
&gt; wsk property get --auth
whisk auth
&lt;user&gt;:&lt;password&gt;</code></pre><pre><code>&gt; curl -u &lt;user&gt;:&lt;password&gt; -H "Content-Type: application/json" --data-ascii "{\"value\":\"message body\"}" \
"https://openwhisk.ng.bluemix.net/api/v1/namespaces/_/actions/myMessageHub/messageHubProduce?blocking=true"</code></pre><p>That&#8217;s all we need for Watson IoT Analytics webhook actions.[2]</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dwNX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dwNX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 424w, https://substackcdn.com/image/fetch/$s_!dwNX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 848w, https://substackcdn.com/image/fetch/$s_!dwNX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 1272w, https://substackcdn.com/image/fetch/$s_!dwNX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dwNX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dwNX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 424w, https://substackcdn.com/image/fetch/$s_!dwNX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 848w, https://substackcdn.com/image/fetch/$s_!dwNX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 1272w, https://substackcdn.com/image/fetch/$s_!dwNX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F930d2db8-b8bd-4319-ad59-62cfed63e2a9_650x570.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Now when a rule is triggered, its webhook action would invoke OpenWhisk Message Hub publish action to publish the alert to the topic. 
The job is done!</p><p><em>Note: the webhook action must use the application/json content type, and the body must contain the key &#8220;value&#8221; because we use the default Message Hub publish action.</em></p><p>For simplicity, I use OpenWhisk&#8217;s default Message Hub publish action directly. You can certainly write your own action and invoke it in the same way (just replace the package and action names).</p><p>What I&#8217;m really showing here is how any code logic (in any language!) can be triggered from Watson IoT Analytics. With OpenWhisk, you can truly integrate Watson IoT Analytics with any system. The sky&#8217;s the limit.</p><p>Oh, did I mention there&#8217;s no server to set up and maintain? And that it <em>REALLY</em> scales?[3]</p><p>Notes:</p><ol><li><p>In most cases, this performs pretty well in terms of processing throughput. However, if your triggering rate is too high for webhook-style processing, it might be better to go with a streaming-style solution. That is a topic for another day.</p></li><li><p>If it bothers you that the webhook body must use the key &#8220;value&#8221; to wrap the actual message content to be published to Message Hub, you can use your own OpenWhisk action. You need a sequence action in which the first action receives and wraps the payload (with the required key &#8220;value&#8221;) and the second action is the Message Hub publish action.</p></li><li><p>While IBM&#8217;s OpenWhisk does scale pretty well, there is currently a set of <a href="https://console.ng.bluemix.net/docs/openwhisk/openwhisk_reference.html#openwhisk_syslimits">limits</a> in place. 
I believe this will be relaxed and improved over time.</p></li><li><p>There is a <a href="https://developer.ibm.com/recipes/tutorials/create-a-rule-to-monitor-elevator-events-in-watson-iot-platform-and-trigger-an-openwhisk-action/">recipe</a> showcasing how to stop a device when some condition is met, in the same way by webhook action plus OpenWhisk action.</p></li></ol>]]></content:encoded></item><item><title><![CDATA[How to resolve dependency conflict out of your control]]></title><description><![CDATA[My project uses Apache Twill and Cassandra Java Driver. Twill uses Guava 13.0.1 while Cassandra Java Driver uses Guava 16.0.1. Even worse&#8230;]]></description><link>https://www.bryantsai.com/p/how-to-resolve-depend-conflict-out-your-control</link><guid isPermaLink="false">https://www.bryantsai.com/p/how-to-resolve-depend-conflict-out-your-control</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Wed, 05 Apr 2017 05:10:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f1ce22bc-b078-4e8d-890e-a2aa6cd5ace7_511x291.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gFkv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gFkv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 424w, https://substackcdn.com/image/fetch/$s_!gFkv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 
848w, https://substackcdn.com/image/fetch/$s_!gFkv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 1272w, https://substackcdn.com/image/fetch/$s_!gFkv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gFkv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2863b556-1988-427d-96fb-19a18100e3b6_511x291.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gFkv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 424w, https://substackcdn.com/image/fetch/$s_!gFkv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 848w, 
https://substackcdn.com/image/fetch/$s_!gFkv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 1272w, https://substackcdn.com/image/fetch/$s_!gFkv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2863b556-1988-427d-96fb-19a18100e3b6_511x291.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>My project uses <a href="http://twill.apache.org/">Apache Twill</a> and <a href="https://github.com/datastax/java-driver">Cassandra Java Driver</a>. Twill uses <a href="https://github.com/google/guava/">Guava</a> 13.0.1 while Cassandra Java Driver uses Guava 16.0.1. Even worse, because Twill and Cassandra Java Driver both leak the usage of Guava, my project also directly uses Guava (both versions).</p><p>The problem is that Guava 16.0.x is not backward compatible with 13.0.x, and upgrading from 13.0.x is non-trivial work.</p><p>This isn&#8217;t that rare with very common dependencies like Guava or <a href="https://hc.apache.org/">Apache HttpComponents</a>. Many open source projects use these libraries (in different versions), so you ARE going to hit it if you have more than a few dependencies. Usually, the solution is simply to upgrade. We could just patch Twill&#8217;s Guava usage to 16.0.x, but Twill is not a small library and its usage of Guava is so fundamental that this option would take a lot of effort.</p><p>One quick and easy way is to shade the dependency. 
By shading, it means relocating/renaming the dependency to avoid conflicts.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dxew!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dxew!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 424w, https://substackcdn.com/image/fetch/$s_!Dxew!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 848w, https://substackcdn.com/image/fetch/$s_!Dxew!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 1272w, https://substackcdn.com/image/fetch/$s_!Dxew!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dxew!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dxew!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 424w, https://substackcdn.com/image/fetch/$s_!Dxew!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 848w, https://substackcdn.com/image/fetch/$s_!Dxew!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 1272w, https://substackcdn.com/image/fetch/$s_!Dxew!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7abf2da-d45d-44e4-9f3e-58dc6a50fa75_800x533.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>Both Gradle and Maven have plugins for shading. Okay, I can&#8217;t believe we&#8217;re still using <a href="https://bryantsai.com/dont-fight-maven-82a39e83b5bf">Maven</a>. Anyway, <a href="http://maven.apache.org/components/plugins/maven-shade-plugin/">Maven shade plugin</a> can directly relocate classes in addition to building a flat jar. 
That is, while packaging all of a project&#8217;s dependencies into a single jar, it can optionally relocate classes as well.</p><p>Here&#8217;s how to use it: add the following to the Twill parent project POM&#8217;s <code>pluginManagement</code> and you get a shaded flat jar.</p><pre><code>&lt;plugin&gt;
  &lt;groupId&gt;org.apache.maven.plugins&lt;/groupId&gt;
  &lt;artifactId&gt;maven-shade-plugin&lt;/artifactId&gt;
  &lt;version&gt;3.0.0&lt;/version&gt;
  &lt;executions&gt;
    &lt;execution&gt;
      &lt;phase&gt;package&lt;/phase&gt;
      &lt;goals&gt;
        &lt;goal&gt;shade&lt;/goal&gt;
      &lt;/goals&gt;
      &lt;configuration&gt;
        &lt;finalName&gt;twill-${project.version}-shaded&lt;/finalName&gt;
        &lt;relocations&gt;
          &lt;relocation&gt;
            &lt;pattern&gt;com.google.common&lt;/pattern&gt;
            &lt;shadedPattern&gt;shaded.com.google.common&lt;/shadedPattern&gt;
          &lt;/relocation&gt;
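          &lt;!-- Illustration only (an assumption, not part of the original build):
               other conflicting libraries could be relocated the same way by
               adding further relocation entries at this point. For Apache
               HttpComponents, a sketch might look like this:
               &lt;relocation&gt;
                 &lt;pattern&gt;org.apache.http&lt;/pattern&gt;
                 &lt;shadedPattern&gt;shaded.org.apache.http&lt;/shadedPattern&gt;
               &lt;/relocation&gt;
          --&gt;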
        &lt;/relocations&gt;
        &lt;artifactSet&gt;
          &lt;includes&gt;
            &lt;include&gt;*:*&lt;/include&gt;
          &lt;/includes&gt;
        &lt;/artifactSet&gt;
      &lt;/configuration&gt;
    &lt;/execution&gt;
  &lt;/executions&gt;
&lt;/plugin&gt;</code></pre><p>Note that I relocate all Guava classes to a new package name, but NO code change in Twill is required: the class relocation happens at the binary (bytecode) level.</p><p>To use this custom-built, shaded flat jar, we can use a local repository in our project:</p><pre><code>&lt;repositories&gt;
  &lt;repository&gt;
    &lt;id&gt;local-repo&lt;/id&gt;
    &lt;url&gt;file://${basedir}/lib&lt;/url&gt;
  &lt;/repository&gt;
&lt;/repositories&gt;</code></pre><p>and install the shaded flat jar into this local repository:</p><pre><code>mvn \
  org.apache.maven.plugins:maven-install-plugin:2.3.1:install-file \
  -Dfile=./twill-0.6.0-incubating-shaded.jar \
  -DgroupId=org.apache.twill \
  -DartifactId=twill \
  -Dversion=0.6.0-incubating-shaded \
  -Dpackaging=jar \
  -DlocalRepositoryPath=lib</code></pre><p>Finally, use the shaded dependency in our project:</p><pre><code>&lt;dependency&gt;
  &lt;groupId&gt;org.apache.twill&lt;/groupId&gt;
  &lt;artifactId&gt;twill&lt;/artifactId&gt;
  &lt;version&gt;0.6.0-incubating-shaded&lt;/version&gt;
&lt;/dependency&gt;</code></pre><p>If you look into the shaded flat jar files, you can see the relocated Guava classes.</p><pre><code># unzip -t lib/twill-0.6.0-incubating-shaded.jar | grep shaded
Archive:  lib/twill-0.6.0-incubating-shaded.jar
    testing: shaded/              OK
    testing: shaded/com/          OK
    testing: shaded/com/google/   OK
    testing: shaded/com/google/common/   OK
    testing: shaded/com/google/common/collect/   OK
    .
    .
    .</code></pre><p>Usually, that&#8217;s all we need to do. Twill&#8217;s usage of Guava 13.0.1 is now renamed and isolated in that flat jar. But remember I said both Twill and Cassandra Java Driver expose Guava classes to our code?</p><p>Because of this, whenever Guava is used for Twill (like passing Guava-typed objects as arguments), I have to use the package &#8220;shaded.com.google.common&#8221;. But when Guava is used for Cassandra Java Driver, I still use the package &#8220;com.google.common&#8221;.</p><p>This does require a code change, though: a simple find-and-replace of the imported package name. Not ideal, but it has to be like this.</p>]]></content:encoded></item>
<item><title><![CDATA[Fault Tolerant Message Processing in Storm]]></title><description><![CDATA[Storm is fault tolerant and lets you choose the level of guarantee with which messages are processed:]]></description><link>https://www.bryantsai.com/p/fault-tolerant-message-processing-in-storm</link><guid isPermaLink="false">https://www.bryantsai.com/p/fault-tolerant-message-processing-in-storm</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Tue, 14 Oct 2014 07:00:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6ac2746f-e4d1-4b66-8d3b-d5bdaab38f22_671x449.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Storm is <a href="https://storm.apache.org/documentation/Fault-tolerance.html">fault tolerant</a> and lets you choose the level of guarantee with which messages are processed:</p><ol><li><p><strong>at-most-once</strong>: In this mode, messages can be dropped if processing fails or times out. This mode requires no special handling, and the messages are processed in the order produced by spouts.</p></li><li><p><strong>at-least-once</strong>: This mode tracks whether each spout tuple is &#8220;fully&#8221; processed within a configured timeout. Any input tuple not fully processed within the timeout is re-emitted. This means the same tuple could be processed more than once and messages can be processed out of order. 
This mode does require user code to follow some &#8220;rules&#8221;, which are briefly described below.</p></li><li><p><strong>exactly-once</strong>: With <a href="http://storm.apache.org/documentation/Trident-tutorial.html">Trident</a>, Storm can provide an &#8220;exactly-once&#8221; guarantee. This will not be discussed further in this post (maybe in a future one).</p></li></ol><p>If your application requires the &#8220;at-least-once&#8221; guarantee, your topology code needs to do the following 3 things:</p><ol><li><p>When spouts emit tuples, specify a unique message ID. If you use an existing <a href="https://storm.apache.org/documentation/Spout-implementations.html">spout implementation</a>, like <a href="https://github.com/apache/storm/tree/master/external/storm-kafka">storm-kafka</a> or <a href="https://github.com/nathanmarz/storm-kestrel">storm-kestrel</a>, it takes care of this so you don&#8217;t need to worry about it.</p></li><li><p>When bolts emit tuples, anchor them to the input tuples.</p></li><li><p>When bolts are done processing the input tuple, ack or fail the input tuple.</p></li></ol><p>That&#8217;s it! If anything goes wrong, Storm re-emits the failed spout tuples. (Actually, it is the spout&#8217;s responsibility to re-emit tuples when its <code>fail</code> method is called. Luckily, in most cases we don&#8217;t have to write our own spout implementation.)</p><p>So how does Storm implement this? Storm&#8217;s <a href="http://storm.apache.org/documentation/Guaranteeing-message-processing.html">implementation</a> is actually quite ingenious. Besides, I believe a little understanding of the internals is always a good thing. It helps in grasping the concepts.</p><p>I&#8217;ll use the following topology as the example, which is composed of 2 spouts and 2 bolts. Also, there are 2 <a href="https://storm.apache.org/documentation/Acking-framework-implementation.html">ackers</a> configured. 
Note that &#8220;sid1&#8221; and &#8220;sid2&#8221; are the IDs of the 2 spout tasks.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PiT8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PiT8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 424w, https://substackcdn.com/image/fetch/$s_!PiT8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 848w, https://substackcdn.com/image/fetch/$s_!PiT8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 1272w, https://substackcdn.com/image/fetch/$s_!PiT8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PiT8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PiT8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 424w, https://substackcdn.com/image/fetch/$s_!PiT8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 848w, https://substackcdn.com/image/fetch/$s_!PiT8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 1272w, https://substackcdn.com/image/fetch/$s_!PiT8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a35e2c9-c4a5-4a1c-bd08-710d0e1853a2_671x449.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>When a spout emits a tuple, it notifies a specific acker task about the new tuple&#8217;s ID and its own task ID. Because there could be multiple acker tasks, a simple mod hash function is used to determine which acker task to notify. 
In this case, spout task &#8220;sid1&#8221; emits tuple &#8220;stid1&#8221; and notifies Acker1 (mod-hash(stid1)=Acker1) with &#8220;[stid1, sid1]&#8221;. Upon receiving this info, Acker1 creates an entry for &#8220;stid1&#8221;. The bookkeeping entry contains 2 pieces of information: the originating spout task ID &#8220;sid1&#8221; and a so-called &#8220;ack val&#8221;. The &#8220;ack val&#8221; is initially set to the spout tuple&#8217;s ID &#8220;stid1&#8221;, which is &#8220;1010&#8221; in binary form (we use a 4-bit value for simplicity in this example; in reality, Storm uses a 64-bit value). The same happens for the other spout&#8217;s tuple &#8220;stid2&#8221; (which is handled by Acker2).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kul_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kul_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 424w, https://substackcdn.com/image/fetch/$s_!kul_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 848w, https://substackcdn.com/image/fetch/$s_!kul_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kul_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kul_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kul_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 424w, https://substackcdn.com/image/fetch/$s_!kul_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 848w, https://substackcdn.com/image/fetch/$s_!kul_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kul_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47ffd713-3ea0-4ac9-8d9a-28c92794e2b9_671x449.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Bolt1 receives tuples &#8220;stid1&#8221; and &#8220;stid2&#8221; and then emits tuple &#8220;tid1&#8221; anchored to both (multi-anchoring). Since Bolt1 knows the IDs of both spout tuples it received, it can easily determine the correct acker tasks to notify (using the mod-hash function). There are two input tuples, so Bolt1 needs to notify twice, once for each. For &#8220;stid1&#8221;, it notifies Acker1 with &#8220;[stid1, tid1]&#8221;. For &#8220;stid2&#8221;, it notifies Acker2 with &#8220;[stid2, tid1]&#8221;. Acker1 first looks up the entry for &#8220;stid1&#8221; to find the current value of its &#8220;ack val&#8221;. Then it XORs the current &#8220;ack val&#8221; with the new tuple&#8217;s ID &#8220;tid1&#8221; and stores the result as the new &#8220;ack val&#8221;. In this case, <code>1010 XOR 1100 = 0110</code>. The same process applies to &#8220;stid2&#8221;: Acker2 updates its entry to <code>1011 XOR 1100 = 0111</code>.</p><p>The use of XOR on all tuple IDs of a DAG (tuple tree) is ingenious. There could be thousands, if not tens of thousands, of tuples in a DAG, and keeping track of each individually is neither efficient nor scalable. This method only requires a fixed amount of memory (about 20 bytes per DAG) and is also extremely fast. 
Also, the XOR strategy does not rely on the ordering of messages received by ackers (see <a href="http://grokbase.com/t/gg/storm-user/131vnnt6j3/xor-vs-reference-counting#20130127uqc3d63otaaxeuimvp5ywwksvu">this</a>).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ANWh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ANWh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 424w, https://substackcdn.com/image/fetch/$s_!ANWh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 848w, https://substackcdn.com/image/fetch/$s_!ANWh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 1272w, https://substackcdn.com/image/fetch/$s_!ANWh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ANWh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ANWh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 424w, https://substackcdn.com/image/fetch/$s_!ANWh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 848w, https://substackcdn.com/image/fetch/$s_!ANWh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 1272w, https://substackcdn.com/image/fetch/$s_!ANWh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2de7610-4bb5-40d6-a8c1-8c00e4486678_669x451.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>After Bolt1 finishes processing both input tuples, it acks both and notifies Acker1 and Acker2 of the acked tuples. 
Acker1 updates the &#8220;ack val&#8221; of &#8220;stid1&#8221; to be <code>0110 XOR 1010 = 1100</code> and Acker2 updates the &#8220;ack val&#8221; of &#8220;stid2&#8221; to be <code>0111 XOR 1011 = 1100</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_Omp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_Omp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 424w, https://substackcdn.com/image/fetch/$s_!_Omp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 848w, https://substackcdn.com/image/fetch/$s_!_Omp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 1272w, https://substackcdn.com/image/fetch/$s_!_Omp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_Omp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_Omp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 424w, https://substackcdn.com/image/fetch/$s_!_Omp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 848w, https://substackcdn.com/image/fetch/$s_!_Omp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 1272w, https://substackcdn.com/image/fetch/$s_!_Omp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3224ed-0e3a-4129-b79c-1ff8658f4101_671x449.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Finally, Bolt2, the last bolt in this topology, receives tuple &#8220;tid1&#8221;. When it acks &#8220;tid1&#8221;, it also needs to notify the acker tasks with this input tuple ID &#8220;tid1&#8221; along with the originating spout tuples&#8217; IDs. 
Storm always copies the originating spout tuple ID(s) into the emitted tuples, so the originating spout tuple IDs are always available downstream. In this case, the received tuple &#8220;tid1&#8221; contains both &#8220;stid1&#8221; and &#8220;stid2&#8221;. With the spout tuple IDs, Bolt2 can notify both Acker1 and Acker2 of the final ack of tuple &#8220;tid1&#8221;. Both Acker1 and Acker2 then update the &#8220;ack val&#8221; of &#8220;stid1&#8221; and &#8220;stid2&#8221; to be <code>1100 XOR 1100 = 0000</code>.</p><p>At this point, the &#8220;ack val&#8221; of both &#8220;stid1&#8221; and &#8220;stid2&#8221; is &#8220;0000&#8221;. When an acker sees a zero &#8220;ack val&#8221;, it marks the originating spout tuple as completed and calls the <code>ack</code> method of the originating spout task.</p>]]></content:encoded></item><item><title><![CDATA[DB2 on Docker]]></title><description><![CDATA[Getting something running on Docker is no news nowadays. Docker makes it super easy to try new things quick and easy: ZooKeeper, Kafka&#8230;]]></description><link>https://www.bryantsai.com/p/db2-on-docker</link><guid isPermaLink="false">https://www.bryantsai.com/p/db2-on-docker</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Fri, 19 Sep 2014 07:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Getting something running on Docker is no news nowadays. Docker makes it super easy to try new things quickly: ZooKeeper, Kafka, Storm, piece of cake. As simple as pulling then running.</p><p>And there&#8217;s even <a href="http://www.fig.sh/">Fig</a>, which is just awesome!</p><p>So why do I want to write about running DB2 on Docker? 
It turns out there are still some dark areas.</p><p>Here it is: <a href="https://github.com/bryantsai/db2-docker">https://github.com/bryantsai/db2-docker</a>, after burning several hours looking things up here and there.</p><p>So what&#8217;s difficult about getting DB2 running on Docker?</p><h4>Docker Storage</h4><p>DB2 uses the <code>O_DIRECT</code> flag, which is not supported by the <a href="http://www.projectatomic.io/docs/filesystems/">aufs storage backend</a> used by Docker. There are two workarounds: changing Docker to <a href="http://muehe.org/posts/switching-docker-from-aufs-to-devicemapper/">use the devicemapper storage backend</a> or using volumes. See this <a href="https://github.com/jeffbonhag/db2-docker/issues/2#issuecomment-52920986">thread</a> for more background.</p><p>Requiring a change of the Docker storage backend is not really an ideal solution, as this is not a per-container option. It&#8217;s best to provide an image that anyone can use easily.</p><p>So that leaves the option of using <a href="http://docs.docker.com/userguide/dockervolumes/">volumes</a>. Essentially, volumes bypass the aufs storage backend. There are several different ways of using volumes, and I found the &#8220;Data Volume Container&#8221; (some call it a &#8220;data-only-container&#8221;) most suitable for DB2 container usage.</p><p>In our DB2 container, the complete <code>/home</code> is mounted on a volume from a &#8220;data-only-container&#8221; whose sole purpose is to export the <code>/home</code> volume. In fact, we don&#8217;t even need this &#8220;data-only-container&#8221; running; we just run it once and the exported volume is usable by our DB2 container:</p><pre><code># this is how we &#8220;create&#8221; the data volume
# note the special usage of &#8220;true&#8221; as the command
&gt; docker run -i --name=db2_data_1 -v /home busybox true</code></pre><pre><code># even if it is not running, the volume is usable by
# other DB2 containers
&gt; docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fdcb78bdb340 busybox:buildroot-2014.02 "true" 6 hours ago Exited (0) 6 hours ago db2_data_1</code></pre><pre><code># launch a DB2 container using the volume
&gt; docker run --privileged=true -i -t --volumes-from=db2_data_1 --name=db2_inst_1 db2:expc</code></pre><p>In this way, the complete DB2 instance data, including the database content, is stored in that volume, separate from the DB2 container. We can start and stop the DB2 container freely and the database content remains available.</p><p>This is of course much better and more portable than using the local file system as the volume. You can also easily <a href="https://docs.docker.com/userguide/dockervolumes/#backup-restore-or-migrate-data-volumes">back up and restore</a> the volume content if you need to.</p><p>This is also better than an in-container volume, as you get to reuse the same volume across different runs.</p><h4>Building with&nbsp;Volumes</h4><p>There&#8217;s still one tricky part about using a volume for DB2 instance data. When building an image from a Dockerfile, there&#8217;s no way of using volumes. Yes, there&#8217;s a <code>VOLUME</code> instruction available in the Dockerfile, but the volume is only mounted when the built image is actually run, not at build time!</p><p>In the end, we have to resort to a shell script to build the DB2 instance container in order to have the volume mounted. We actually build a base DB2 image using &#8220;compact&#8221; installation mode (no instance created) and then use a shell script to run this image with the needed volume mounted:</p><pre><code>docker run -i --name=db2_data_1 -v /home busybox true</code></pre><pre><code>docker run --rm=true -i --volumes-from=db2_data_1 db2:expc /bin/bash &lt;&lt;EOF
 userdel dasusr1;userdel db2fenc1;userdel db2inst1;groupdel dasadm1;groupdel db2fgrp1;groupdel db2grp1
 groupadd db2grp1;groupadd db2fgrp1;groupadd dasadm1;useradd -g db2grp1 -m -d /home/db2inst1 db2inst1 -p db2inst1;useradd -g db2fgrp1 -m -d /home/db2fenc1 db2fenc1 -p db2fenc1;useradd -g dasadm1 -m -d /home/dasusr1 dasusr1 -p dasusr1
 /opt/ibm/db2/V10.5/instance/db2icrt -p $port -u db2fenc1 db2inst1
EOF</code></pre><p>When it ends, the DB2 instance is created and persisted in the volume from <code>db2_data_1</code>. We don&#8217;t need this &#8220;build&#8221; container anymore, so it is removed right after completion (<code>--rm=true</code>).</p><p>Once the DB2 instance is created, we can launch a DB2 container and get into it:</p><pre><code>docker run --privileged=true --rm=true -i -t -P --volumes-from=db2_data_1 --name=db2_inst_1 db2:expc /bin/su -c '/home/db2inst1/sqllib/adm/db2start;/bin/bash' - db2inst1</code></pre><p>From there, you can create/use databases.</p><h4>Privileged Mode</h4><p>You will see the following error when starting the DB2 database manager if you don&#8217;t run the container in privileged mode:</p><pre><code>$ db2start
SQL1042C An unexpected system error occurred.</code></pre><p>According to <a href="https://github.com/jeffbonhag/db2-docker">https://github.com/jeffbonhag/db2-docker</a>,</p><blockquote><p>DB2 has a problem where it needs more shared memory than Docker originally provides.</p></blockquote><p>That&#8217;s why we add the <code>--privileged=true</code> parameter when running it.</p><h4>Bonus Points</h4><p>I&#8217;m using Mac OS X, which was another major reason this took so much time.</p><p>Check out <a href="http://viget.com/extend/how-to-use-docker-on-os-x-the-missing-guide">How to Use Docker on OS X: The Missing Guide</a> if you use Docker on Mac OS X. There are at least two takeaways.</p><p>On Mac OS X, because there&#8217;s a &#8220;boot2docker&#8221; VM serving as the Docker host, you cannot directly mount OS X local files or directories as a volume. The normal Docker mount is meant for the Docker host, and on OS X that&#8217;s the &#8220;boot2docker&#8221; VM.</p><p>Luckily there&#8217;s a convenient solution: check out <a href="https://medium.com/boot2docker-lightweight-linux-for-docker/boot2docker-together-with-virtualbox-guest-additions-da1e3ab2465c">boot2docker together with VirtualBox Guest Additions</a>. With it, you can mount files and directories under <code>/Users</code> on OS X using the Docker command directly.</p><p>The other takeaway is how to get inside a running Docker container. What does that mean?</p><p>Docker containers are often run in detached mode. Unless you enable sshd (be cautious and see <a href="http://jpetazzo.github.io/2014/06/23/docker-ssh-considered-evil/">this</a>), you cannot get into a running container. Sometimes we just need to get in there. So how can we do that without sshd?</p><p>Create the following shell script and put it somewhere in your <code>PATH</code>:</p><pre><code>#!/bin/bash
set -e</code></pre><pre><code># Check for nsenter. If not found, install it
boot2docker ssh '[ -f /var/lib/boot2docker/nsenter ] || docker run --rm -v /var/lib/boot2docker/:/target jpetazzo/nsenter'</code></pre><pre><code># Use bash if no command is specified
# (collect the arguments into an array so a command can be appended)
args=("$@")
if [[ $# = 1 ]]; then
 args+=(/bin/bash)
fi</code></pre><pre><code>boot2docker ssh -t sudo /var/lib/boot2docker/docker-enter "${args[@]}"</code></pre><p>Then you can do some crazy stuff like this:</p><pre><code>&gt; docker-enter db2_inst_1 ps -A
 PID TTY TIME CMD
 1 ? 00:00:00 su
 8 ? 00:00:00 bash
 97 ? 00:00:00 db2syscr
 99 ? 00:00:01 db2sysc
 105 ? 00:00:00 db2syscr
 106 ? 00:00:00 db2syscr
 107 ? 00:00:00 db2syscr
 109 ? 00:00:00 db2vend
 117 ? 00:00:00 db2fmp
 118 ? 00:00:00 bash
 489 ? 00:00:00 ps</code></pre><pre><code>&gt; docker-enter db2_inst_1
% hostname
f4c1b9530fef</code></pre><p>How cool is that!</p>]]></content:encoded></item><item><title><![CDATA[Ruby expression ‘defined?’]]></title><description><![CDATA[Recently I got bitten by defined?. Here&#8217;s the gotcha (the condition always passes and never goes into else&#8230;):]]></description><link>https://www.bryantsai.com/p/ruby-expression-defined</link><guid isPermaLink="false">https://www.bryantsai.com/p/ruby-expression-defined</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Tue, 19 Aug 2014 07:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently I got bitten by <a href="http://ruby-doc.org/docs/keywords/1.9/Object.html#method-i-defined-3F"><code>defined?</code></a>. Here&#8217;s the gotcha (the condition always passes and never goes into else&#8230;):</p><pre><code>if defined? MyString &amp;&amp; obj.instance_of?(MyString)
 # &#8230;
else
 # &#8230;
end</code></pre><p>It turned out to be <a href="http://www.techotopia.com/index.php/Ruby_Operator_Precedence#Operator_Precedence_Table">operator precedence</a> (<code>&amp;&amp;</code> has higher precedence than <code>defined?</code>).</p><pre><code>obj = 'str'</code></pre><pre><code>defined? String # =&gt; "constant"
defined? String &amp;&amp; obj.instance_of?(String) # =&gt; "expression"
defined?(String) &amp;&amp; obj.instance_of?(String) # =&gt; true</code></pre><pre><code># MyString is not defined yet
defined? MyString # =&gt; nil
defined? MyString &amp;&amp; obj.instance_of?(MyString) # =&gt; "expression"
defined?(MyString) &amp;&amp; obj.instance_of?(MyString) # =&gt; nil</code></pre><p>The original condition is actually equivalent to <code>defined?(MyString &amp;&amp; obj.instance_of?(MyString))</code>. According to <a href="http://ruby-doc.org/docs/keywords/1.9/Object.html#method-i-defined-3F"><code>defined?</code></a>, the expression given to <code>defined?</code> is not executed, which means <code>MyString &amp;&amp; obj.instance_of?(MyString)</code> is always reported as an &#8220;expression&#8221;.</p><p>That&#8217;s the price for not memorizing the operator precedence table&#8230; However, since <code>defined?</code> has such low precedence, I guess it&#8217;s easier to just always add parentheses.</p>]]></content:encoded></item><item><title><![CDATA[Ruby Encoding]]></title><description><![CDATA[I still remember the frustration in the migration from Ruby 1.8 to 1.9 for Performance Analysis Suite. Most of the pain came from encoding&#8230;]]></description><link>https://www.bryantsai.com/p/ruby-encoding</link><guid isPermaLink="false">https://www.bryantsai.com/p/ruby-encoding</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Wed, 16 Jul 2014 07:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I still remember the frustration in the migration from Ruby 1.8 to 1.9 for <a href="https://perfanalyst.mybluemix.net/?utm_source=bryantsai&amp;utm_medium=link&amp;utm_campaign=blog">Performance Analysis Suite</a>. Most of the pain came from encoding problems.</p><p>Encoding in Ruby could drive you crazy if you have not dealt with it before or if you are coming from other languages like Java and Python. I found that Yehuda Katz has the most comprehensive explanation of the encoding topic. 
You should read his <a href="http://yehudakatz.com/2010/05/17/encodings-unabridged/">Encodings, Unabridged</a> and <a href="http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/">Ruby 1.9 Encodings: A Primer and the Solution for Rails</a> first. If you ever want to know more background on encoding and Unicode, Joel Spolsky has the best treatment: <a href="http://www.joelonsoftware.com/articles/Unicode.html">The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)</a>.</p><p>It&#8217;s actually not all that hard.</p><p>In Ruby (1.9+), each String has its own Encoding. This is different from other languages like Java and Python, which transcode every String to the same internal Unicode representation. With all Strings using the same encoding, Strings can be freely mixed together by various operations (like concatenation), and developers don&#8217;t need to worry about encoding when dealing with Strings. Do you remember ever seeing an encoding exception thrown from a String operation in Java? Ruby, on the other hand, can fail like this:</p><pre><code>incompatible character encodings: ISO-8859-1 and UTF-8 (Encoding::CompatibilityError)</code></pre><p>The reason for this error is dead simple: ISO-8859-1 and UTF-8 encodings are not compatible and we just can&#8217;t mix them together (unless at least one of the Strings contains only ASCII characters).</p><p>Note that it is a runtime error, raised only when two Strings with incompatible encodings come together. It is quite likely that things work perfectly fine until some day it blows up.</p><p>To prevent this error, we have to make sure to never mix Strings of incompatible encodings together. One way is to manually transcode either one to the other&#8217;s encoding. 
<code>String#encode</code> makes it really easy to do that (more on this method&#8217;s usage later in this article).</p><p>An even easier way is to make Ruby behave like Java and Python, transcoding all Strings to the same encoding by default:</p><pre><code>Encoding.default_internal = 'UTF-8'</code></pre><p>(If you are still using 1.9, you also need to add the magic comment <code># encoding: utf-8</code> in the first or second line of every source file. On 2.0 or later, the default source encoding is already UTF-8.)</p><p>Ruby, the Ruby standard libraries, and most major libraries should already respect this option. But that&#8217;s just half of the story. If you are responsible for bringing Strings into Ruby, you have to make sure they are transcoded from the correct encoding and to the intended encoding:</p><pre><code># Encoding.default_internal = 'UTF-8'
str = File.binread("&#8230;") # str.encoding is ASCII-8BIT
str.force_encoding("Shift_JIS").encode!</code></pre><p>The <code>force_encoding</code> call tells Ruby to start using the given encoding for the String, and then <code>encode!</code> re-encodes it from Shift_JIS to <code>Encoding.default_internal</code>. Note that <code>force_encoding</code> does not cause any re-encoding; it merely re-tags the String&#8217;s encoding. What this code snippet does is read a file in, decode it as Shift_JIS, then transcode the resulting String to UTF-8. You could achieve the same effect with the following code:</p><pre><code># Encoding.default_internal = 'UTF-8'
str = File.open("&#8230;", "r:Shift_JIS", &amp;:read)</code></pre><p>That&#8217;s probably all you need to know about Ruby encoding. You also need to be careful of the source encoding when reading data in (via IO, File, etc.), but that is really not specific to Ruby.</p><p>Now, on to my experience dealing with a few encoding-related problems.</p><h4>Dealing with Unknown&nbsp;Encoding</h4><p>Before you read in a file or receive some data from the internet, you have to know what encoding it uses in order to decode it. What if we just don&#8217;t know its encoding?</p><p>I&#8217;m not sure if there is a complete solution for auto-detecting encoding. At least not one that I can find easily.</p><p>The first problem from <a href="https://perfanalyst.mybluemix.net/?utm_source=bryantsai&amp;utm_medium=link&amp;utm_campaign=blog">Performance Analysis Suite</a> was with DB2 snapshot text output, which is in <em>plain text</em> without encoding information. Most user databases now use the UTF-8 codeset, so UTF-8 encoding was chosen to read in the snapshots. The only problem is that with non-UTF-8 databases, the SQL queries in snapshots might contain literals incompatible with UTF-8. Either the literal contains a byte sequence that is illegal in UTF-8, or the literal contains some non-ASCII characters which would look weird when interpreted as UTF-8.</p><p>Since the DB2 snapshot statement tab has a simulated statement concentration function, the actual literal values are not really critical to the performance analysis. It is therefore okay for our application to accept some <em>data loss</em>. So we remove all invalid characters from the SQL queries:</p><pre><code>str = File.binread("&#8230;") # str.encoding is ASCII-8BIT
str.encode!('UTF-8', 'ASCII-8BIT', invalid: :replace, undef: :replace, replace: '')</code></pre><p>Any byte in the ASCII-8BIT source that cannot be transcoded to UTF-8 (in practice, every non-ASCII byte) is removed (replaced by an empty String). In the end, you get the snapshot text output content read in as a UTF-8 String.</p><h4>Using Binary when You Need&nbsp;Binary</h4><p>Another problem was related to the dumping of <em>possible</em> binary data to a YAML file. Some bitwise operation was applied on each character and then the Strings were persisted by <code>YAML.dump</code>. We used to apply the bitwise operation directly on the String with UTF-8 encoding. This caused a lot of pain as there&#8217;s no reliable way to read it back.</p><p>We should have explicitly used <code>force_encoding("ASCII-8BIT")</code> before applying the bitwise operation:</p><pre><code># change to binary before applying bitwise operation
str.force_encoding('ASCII-8BIT')
result = ''.force_encoding('ASCII-8BIT')</code></pre><pre><code># bitwise operation
str.size.times {|i| result &lt;&lt; bitwise(str[i].ord, decoding)}</code></pre><pre><code># change back to UTF-8 if decoding
result.force_encoding('UTF-8') if decoding
</code></pre><p>When decoding, after bitwise operation we just tag the encoding back to UTF-8.</p>]]></content:encoded></item><item><title><![CDATA[Don’t Fight Maven]]></title><description><![CDATA[I&#8217;m not here to tell why I don&#8217;t like Maven or what&#8217;s wrong with Maven. Google around and you&#8217;ll easily find many. I just want to share a&#8230;]]></description><link>https://www.bryantsai.com/p/dont-fight-maven</link><guid isPermaLink="false">https://www.bryantsai.com/p/dont-fight-maven</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Sat, 07 Jun 2014 07:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!b2Hk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!b2Hk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!b2Hk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 424w, https://substackcdn.com/image/fetch/$s_!b2Hk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 848w, 
https://substackcdn.com/image/fetch/$s_!b2Hk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 1272w, https://substackcdn.com/image/fetch/$s_!b2Hk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!b2Hk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png" width="1456" height="601" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:601,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4249716,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!b2Hk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 424w, https://substackcdn.com/image/fetch/$s_!b2Hk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 848w, 
https://substackcdn.com/image/fetch/$s_!b2Hk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 1272w, https://substackcdn.com/image/fetch/$s_!b2Hk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ec3bc4e-bc87-435b-a966-c212a1f22e7b_2880x1188.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>I&#8217;m not here to tell why I don&#8217;t like Maven or what&#8217;s wrong with Maven. 
Google around and you&#8217;ll easily find plenty of those. I just want to share a few tidbits about Maven.</p><p>So here&#8217;s my typical relationship with Maven:</p><ol><li><p>When starting a new project, grab an old POM from a previous project.</p></li><li><p>Everything&#8217;s fine, great! Leave it alone. Done.</p></li><li><p>Okay, something&#8217;s new. Google for solutions&#8230;</p></li><li><p>Most of the time, there <em>ARE</em> solutions to my new build requirements. But strangely, some tweaking is <em>ALWAYS</em> needed.</p></li><li><p>Done. Leave it alone.</p></li><li><p>Until next time&#8230;</p></li></ol><p>It&#8217;s especially interesting how often the 3rd and 4th points above happen. When using Maven, I find myself constantly needing Google (and Stack Overflow). That should say something already.</p><p>Anyway, I&#8217;ve long since learned my lesson (and you have probably already read this somewhere):</p><blockquote><p>Don&#8217;t fight Maven</p></blockquote><h4>Multi-Module Projects</h4><p>Maven <em>believes</em> the complete project, including all modules, should live in a single repository, with the modules in sub-directories.</p><p>&#8220;That can&#8217;t be right&#8221;, you say. &#8220;We have to put each module in its own repository.&#8221;</p><p>It&#8217;s a good and valid cry. But it usually ends with many hours spent hacking around.</p><p>Remember: don&#8217;t fight Maven.</p><p>One way is to use Git submodules and create a new <em>aggregate</em> parent project. Maven then takes care of the module dependencies appropriately. With a small change, you don&#8217;t even need an extra <em>aggregate</em> project just for Maven.</p><p>Let&#8217;s use a very simple example: a web project A uses a jar project B, and A and B each live in their own Git repository. How should we use Maven for them?</p><p>We can use an <em>aggregate</em> multi-module POM as the parent, which includes both A and B as modules. 
This <em>aggregate</em> POM lives in A as <code>pom.xml</code>, since it is the starting point for building A. The POM of the actual A (the web project), named <code>pom_A.xml</code>, sits next to <code>pom.xml</code> in project A. <code>pom_A.xml</code> is the real deal, responsible for building web project A itself.</p><p>Project B is then added as a Git submodule of A, under the sub-directory <code>lib/B</code>. In <code>pom_A.xml</code>, we declare the dependency on B as usual. And that&#8217;s it.</p><pre><code>&lt;!-- pom.xml --&gt;
&lt;groupId&gt;&#8230;&lt;/groupId&gt;
&lt;artifactId&gt;&#8230;&lt;/artifactId&gt;
&lt;version&gt;1.0.0-SNAPSHOT&lt;/version&gt;
&lt;name&gt;&#8230;&lt;/name&gt;
&lt;packaging&gt;pom&lt;/packaging&gt;
&lt;modules&gt;
 &lt;module&gt;./pom_A.xml&lt;/module&gt;
 &lt;module&gt;lib/B&lt;/module&gt;
&lt;/modules&gt;
&lt;build&gt;
 &lt;plugins&gt;
  &lt;plugin&gt;
   &lt;artifactId&gt;maven-clean-plugin&lt;/artifactId&gt;
   &lt;version&gt;2.5&lt;/version&gt;
   &lt;configuration&gt;
    &lt;skip&gt;true&lt;/skip&gt;
   &lt;/configuration&gt;
  &lt;/plugin&gt;
 &lt;/plugins&gt;
&lt;/build&gt;
&#8230;</code></pre><pre><code>&lt;!-- pom_A.xml --&gt;
&lt;groupId&gt;&#8230;&lt;/groupId&gt;
&lt;artifactId&gt;A&lt;/artifactId&gt;
&lt;version&gt;1.0.0-SNAPSHOT&lt;/version&gt;
&lt;name&gt;A&lt;/name&gt;
&lt;packaging&gt;war&lt;/packaging&gt;
&lt;dependencies&gt;
 &lt;dependency&gt;
  &lt;groupId&gt;&#8230;&lt;/groupId&gt;
  &lt;artifactId&gt;B&lt;/artifactId&gt;
  &lt;version&gt;1.0.0-SNAPSHOT&lt;/version&gt;
 &lt;/dependency&gt;
&#8230;</code></pre><pre><code>&lt;!-- pom_B.xml --&gt;
&lt;groupId&gt;&#8230;&lt;/groupId&gt;
&lt;artifactId&gt;B&lt;/artifactId&gt;
&lt;version&gt;1.0.0-SNAPSHOT&lt;/version&gt;
&lt;name&gt;B&lt;/name&gt;
&lt;packaging&gt;jar&lt;/packaging&gt;
&#8230;</code></pre><p>One caveat is that the clean phase of the <em>aggregate</em> POM itself is skipped. Without this extra configuration, if you build with <code>mvn clean package</code>, you&#8217;d be surprised to see that right after A and B are built successfully, the build artifacts are cleaned away! Like I said, always tweaking&#8230;</p><p>One more thing: remember to run <code>git submodule update --init --recursive</code> first. I should probably Google how to include this in the Maven build as well&#8230;</p><h4>Dependencies Not In Any Maven Repository</h4><p>How should a private library be handled in a Maven build?</p><p>One choice is to set up an internal Maven repository. If that is overkill, use a local repository inside the project and check the private library in as well.</p><p>First, install the private library into the sub-directory <code>lib</code> (which will serve as the local Maven repository), and add the created repository files and sub-directories to SCM.</p><p>Then add an extra repository pointing to the sub-directory <code>lib</code> to the POM, and declare the dependency as usual. Done.</p><pre><code># install_lib.sh
mvn org.apache.maven.plugins:maven-install-plugin:2.3.1:install-file \
 -Dfile=./cglib-nodep.jar \
 -DgroupId=cglib \
 -DartifactId=cglib-nodep \
 -Dversion=2.2.3 \
 -Dpackaging=jar \
 -DlocalRepositoryPath=lib</code></pre><pre><code>&lt;!-- pom.xml --&gt;
&lt;dependencies&gt;
 &lt;dependency&gt;
  &lt;groupId&gt;cglib&lt;/groupId&gt;
  &lt;artifactId&gt;cglib-nodep&lt;/artifactId&gt;
  &lt;version&gt;2.2.3&lt;/version&gt;
 &lt;/dependency&gt;
 &#8230;
&lt;/dependencies&gt;
&lt;repositories&gt;
 &lt;repository&gt;
  &lt;id&gt;local-repo&lt;/id&gt;
  &lt;url&gt;file://${basedir}/lib&lt;/url&gt;
 &lt;/repository&gt;
&lt;/repositories&gt;
&#8230;</code></pre>]]></content:encoded></item><item><title><![CDATA[HTTP Caching]]></title><description><![CDATA[Caching in HTTP can be tricky sometimes. Getting it right following the spec is not all that difficult, but in reality, different browsers&#8230;]]></description><link>https://www.bryantsai.com/p/http-caching</link><guid isPermaLink="false">https://www.bryantsai.com/p/http-caching</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Sun, 27 Apr 2014 07:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Caching in HTTP can be tricky sometimes. Getting it right following the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html">spec</a> is not all that difficult, but in reality, different browsers and versions usually upset us.</p><p>Browsing through <a href="http://stackoverflow.com/search?q=http+cache">stackoverflow</a>, you could easily find many having the same struggle. 
We just can&#8217;t, or don&#8217;t have time to, figure out all the edge cases ourselves.</p><p>So here are the hard and fast rules I have figured out and will follow in the future:</p><h4>Static Resources</h4><p>Content that never changes: JS and CSS files, images, and any kind of binary file all fall into this category.</p><p>By <em><strong>never</strong></em>, I really mean never. It&#8217;s a common best practice to <em>version</em> static resources: whenever their content changes, so does their URL.</p><p>Here are the simple rules for static resources:</p><ul><li><p>Embed a fingerprint in either the file name or the path. Avoid using the query string for the fingerprint. Also, ensure the generated URLs differ on more than 8-character boundaries.</p></li><li><p>Use these HTTP headers:</p></li></ul><pre><code>Cache-Control: public, max-age=31536000
Expires: (a year from now)
ETag: (based on content)
Last-Modified: (some time in the past)
Vary: Accept-Encoding</code></pre><p>It&#8217;s that simple for static resources.</p><h4>Dynamic Resources</h4><p>Depending on the application&#8217;s requirements for freshness and privacy, different cache control settings should be used.</p><p>For non-private and constantly changing resources (think of a stock ticker), the following could be used:</p><pre><code>Cache-Control: public, max-age=0
Expires: (now)
ETag: (based on content)
Last-Modified: (some time in the past)
Vary: Accept-Encoding</code></pre><p>The effect is that the resource can be cached publicly (by browsers as well as by proxy servers). Each time before browsers use the resource, they check whether there&#8217;s a newer version and download it if there is.</p><p>Note that with this, browsers have some flexibility on revalidation. Typically, when users click the back/forward buttons, browsers do not revalidate but just use the cached version. If you&#8217;d like stricter control, say browsers must go back to the server even when the back/forward buttons are clicked, use the following (<code>no-store</code> tells caches not to keep the response at all, so <code>public</code> no longer applies):</p><pre><code>Cache-Control: no-cache, no-store</code></pre><p>Not all dynamic resources become stale right away. If they can stay fresh for at least 5 minutes, use:</p><pre><code>Cache-Control: public, max-age=300</code></pre><p>With this, browsers only revalidate after 5 minutes. Before that, cached content is used directly. If strict control over staleness is also required after 5 minutes, you can add <code>must-revalidate</code>:</p><pre><code>Cache-Control: public, max-age=300, must-revalidate</code></pre><p>For private or per-user content, replace <code>public</code> with <code>private</code> to avoid the content being cached by proxies:</p><pre><code>Cache-Control: private, &#8230;</code></pre><h4>Cache-Control and&nbsp;Expires</h4><p>When both <code>Cache-Control</code> and <code>Expires</code> are used, <code>Cache-Control</code> takes precedence.</p><p>Using both <code>Cache-Control</code> and <code>Expires</code> is meant to gain wider support (across different browsers and versions). Of course, they should be configured to mean the same freshness to avoid any confusion.</p><p>See <a href="http://squid-web-proxy-cache.1019090.n4.nabble.com/Expires-vs-Cache-Control-max-age-td1033350.html">Expires: vs. Cache-Control: max-age</a>.</p><h4>ETag and Last-Modified</h4><p>These headers are used when browsers do revalidation. 
Basically, browsers just store the values of these headers as received from the server, and later, when validating, send a conditional request carrying these values back to the server (via the headers <code>If-None-Match</code> and <code>If-Modified-Since</code>, respectively).</p><p>Note that validation normally only occurs after the resource has expired.</p><p>When both <code>If-None-Match</code> and <code>If-Modified-Since</code> are present in a conditional request, it is up to the server how to handle them. However, since it is the server that generates <code>ETag</code> and/or <code>Last-Modified</code> in the first place, in practice this is not much of a problem. Most browsers do send both if available.</p><p>See <a href="http://stackoverflow.com/questions/824152/what-takes-precedence-the-etag-or-last-modified-http-header">What takes precedence: the ETag or Last-Modified HTTP header?</a></p><p>One frequent suggestion is to avoid the use of <code>ETag</code>. This is not always valid advice. <code>ETag</code> does provide more precise detection of whether content has really changed. The default Apache method for generating <code>ETag</code> (before Apache 2.4) takes the file inode, size, and last-modified time as input. This makes the generated <code>ETag</code> value pretty useless in a load-balanced environment, because each server generates a different <code>ETag</code> value for the same file. This is probably the only issue that leads a lot of people to disable <code>ETag</code> completely, which is not really necessary as long as a single unique <code>ETag</code> value is generated for exactly matching file content.</p><p>See <a href="https://www.techpunch.co.uk/development/should-your-site-be-using-etags-or-not">Should your site be using etags or not?</a></p><h4>Manually hitting&nbsp;Ctrl-R</h4><p>When Ctrl-R is hit, browsers send a request with the following headers to check whether the cached content needs to be refreshed:</p><pre><code>Cache-Control: max-age=0
If-None-Match: &#8230;
If-Modified-Since: &#8230;</code></pre><p>Note that this request is not meant only for the origin server, but also for any proxy servers along the way. Essentially, it revalidates the content. If the reply is 304, the browser uses the cached content.</p><h4>Vary: Accept-Encoding</h4><p>This header might be unfamiliar to some.</p><p>When a resource is served with gzip compression and is cached by proxy servers, clients that do not support gzip could receive incorrect (that is, compressed) data without this header. It instructs proxy servers to cache two versions of the resource: one compressed and one uncompressed. The correct version is then delivered based on the <code>Accept-Encoding</code> request header.</p><p>Another reason is practical: Internet Explorer does not cache resources served with a <code>Vary</code> header that names any field other than <code>Accept-Encoding</code> and <code>User-Agent</code>. So adding the header in exactly this form ensures these resources are cached by IE.</p><ul><li><p><a href="https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers">Increasing Application Performance with HTTP Cache Headers</a></p></li><li><p><a href="http://tomayko.com/writings/things-caches-do">Things Caches Do</a></p></li><li><p><a href="https://developers.google.com/speed/articles/caching">Google Developers: HTTP Caching</a></p></li><li><p><a href="https://developers.google.com/speed/docs/best-practices/caching?csw=1">Google Developers: Optimize Caching</a></p></li><li><p><a href="http://www.mnot.net/cache_docs/">Caching Tutorial for Web Authors and Webmasters</a></p></li><li><p><a href="http://palizine.plynt.com/issues/2008Jul/cache-control-attributes/">Cache Control Directives Demystified</a></p></li><li><p><a href="http://webmasters.stackexchange.com/questions/1459/what-are-the-hard-and-fast-rules-for-cache-control?lq=1">What are the hard and fast rules for Cache Control?</a></p></li><li><p><a 
href="http://stackoverflow.com/questions/1046966/whats-the-difference-between-cache-control-max-age-0-and-no-cache">What&#8217;s the difference between Cache-Control: max-age=0 and no-cache?</a></p></li><li><p><a href="http://stackoverflow.com/questions/2932890/http-cache-control-max-age-must-revalidate">HTTP Cache Control max-age, must-revalidate</a></p></li><li><p><a href="http://stackoverflow.com/questions/18148884/difference-between-no-cache-and-must-revalidate?rq=1">Difference between no-cache and must-revalidate</a></p></li><li><p><a href="http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx">Caching Improvements in Internet Explorer 9</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[車險要怎麼保?]]></title><description><![CDATA[&#25033;&#35442;&#27794;&#26377;&#20154;&#36023;&#36554;&#19981;&#20445;&#36554;&#38570;&#30340;&#21543;? (&#21487;&#20197;&#21966;?) &#20063;&#25033;&#35442;&#27794;&#26377;&#20154;&#35258;&#24471;&#36554;&#38570;&#19981;&#36020;&#21543;? (&#26377;&#21966;?) 
&#19981;&#36942;&#27599;&#24180;&#26178;&#38291;&#19968;&#21040;&#65292;&#24448;&#24448;&#23601;&#22312;&#26178;&#38291;&#33287;&#26989;&#21209;&#21729;&#30340;&#38617;&#37325;&#22739;&#21147;&#19979;&#65292;&#38568;&#38568;&#20415;&#20415;&#23601;&#20184;&#20102;&#37666;&#65292;&#21839;&#38988;&#26159;&#20320;&#30495;&#30340;&#26296;&#35299;&#21040;&#24213;&#20445;&#20102;&#20160;&#40636;&#27171;]]></description><link>https://www.bryantsai.com/p/auto-insurance</link><guid isPermaLink="false">https://www.bryantsai.com/p/auto-insurance</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Mon, 03 Dec 2007 08:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#25033;&#35442;&#27794;&#26377;&#20154;&#36023;&#36554;&#19981;&#20445;&#36554;&#38570;&#30340;&#21543;? (&#21487;&#20197;&#21966;?) &#20063;&#25033;&#35442;&#27794;&#26377;&#20154;&#35258;&#24471;&#36554;&#38570;&#19981;&#36020;&#21543;? (&#26377;&#21966;?) &#19981;&#36942;&#27599;&#24180;&#26178;&#38291;&#19968;&#21040;&#65292;&#24448;&#24448;&#23601;&#22312;&#26178;&#38291;&#33287;&#26989;&#21209;&#21729;&#30340;&#38617;&#37325;&#22739;&#21147;&#19979;&#65292;&#38568;&#38568;&#20415;&#20415;&#23601;&#20184;&#20102;&#37666;&#65292;&#21839;&#38988;&#26159;&#20320;&#30495;&#30340;&#26296;&#35299;&#21040;&#24213;&#20445;&#20102;&#20160;&#40636;&#27171;&#30340;&#36554;&#38570;&#21966;? 
&#36996;&#26377;&#65292;&#20320;&#30906;&#23450;&#20445;&#36889;&#20491;&#23565;&#21966;&#12289;&#20445;&#36889;&#27171;&#22816;&#21966;?</p><p>&#19968;&#27171;&#65292;&#24597;&#20320;&#30475;&#19981;&#19979;&#21435;&#65292;&#20808;&#20358;&#35611;&#19968;&#19979;&#25105;&#30340;&#32080;&#35542;:</p><blockquote><p>&#36023;&#36554;&#38570;&#19968;&#23450;&#35201;&#20808;&#36023;&#31532;&#19977;&#20154;&#24847;&#22806;&#36012;&#20219;&#38570;&#65292;&#32780;&#19988;&#33267;&#23569;&#35201;&#36023;&#21040; 200 &#33836;&#20197;&#19978;&#65292;&#26377;&#20102;&#20197;&#24460;&#22914;&#26524;&#36996;&#26377;&#38928;&#31639;&#65292;&#25165;&#21435;&#32771;&#24942;&#36554;&#39636;&#38570;&#21450;&#31434;&#30428;&#38570;&#12290;</p></blockquote><p>&#36889;&#35041;&#30340;&#30446;&#30340;&#19981;&#26159;&#35201;&#30465;&#37666;&#65292;&#25152;&#20197;&#25105;&#23601;&#19981;&#24324;&#19968;&#22534;&#35430;&#31639;&#34920;&#20102;&#65292;&#21482;&#26159;&#19968;&#40670;&#30740;&#31350;&#36942;&#24460;&#30340;&#24515;&#24471;&#22577;&#21578;&#12290;</p><p>&#39318;&#20808;&#65292;&#36554;&#38570;&#26377;&#24456;&#22810;&#32048;&#38917;&#65292;&#33267;&#26044;&#28858;&#20160;&#40636;&#35201;&#25630;&#25104;&#36889;&#27171;? &#20320;&#21839;&#25105;&#65292;&#25105;&#20063;&#19981;&#30693;&#36947;&#35201;&#21839;&#35504;&#65292;&#21453;&#27491;&#25972;&#20491;&#23601;&#26159;&#19968;&#20491;&#33707;&#21517;&#20854;&#22937;&#23601;&#23565;&#20102;! 
&#19981;&#36942;&#19981;&#31649;&#24590;&#27171;&#65292;&#35201;&#30693;&#36947;&#36067;&#30340;&#26159;&#20160;&#40636;&#34277;&#33167;&#65292;&#36996;&#24471;&#20808;&#25630;&#25026;&#36889;&#33899;&#34310;&#35041;&#21040;&#24213;&#35037;&#20102;&#20160;&#40636;&#25165;&#34892;&#12290;</p><h4>&#24375;&#21046;&#36012;&#20219;&#38570;</h4><p>&#36889;&#20491;&#26159;&#25919;&#24220;&#35215;&#23450;&#19968;&#23450;&#24471;&#20445;&#65292;&#32780;&#19988;&#27794;&#20445;&#35686;&#23519;&#26371;&#38283;&#32624;&#21934;&#30340;[&#185;]&#12290;&#36889;&#20491;&#20445;&#38570;&#30340;&#30446;&#30340;&#30070;&#28982;&#26159;&#24076;&#26395;&#25152;&#26377;&#20132;&#36890;&#20107;&#25925;&#30340;&#21463;&#23475;&#32773;&#37117;&#33021;&#21463;&#21040;&#20445;&#38556; (&#19981;&#31649;&#26159;&#38283;&#36554;&#30340;&#12289;&#22352;&#36554;&#30340;&#12289;&#29978;&#33267;&#26159;&#22312;&#36554;&#23376;&#22806;&#38754;&#19981;&#24184;&#21463;&#21040;&#27874;&#21450;&#34987;&#25758;&#30340;) &#12290;&#31777;&#21934;&#20358;&#35498;&#65292;&#20219;&#20309;&#20154;&#21482;&#35201;&#30332;&#29983;&#20132;&#36890;&#20107;&#25925;&#32780;&#21463;&#20663;&#65292;&#23601;&#21487;&#20197;&#21040;&#20219;&#20309;&#19968;&#23478;&#26377;&#25215;&#20445;&#24375;&#21046;&#36012;&#20219;&#38570;&#30340;&#20445;&#38570;&#20844;&#21496;&#30003;&#35531;&#29702;&#36064;&#65292;&#21363;&#20351;&#21482;&#26159;&#36208;&#22312;&#36335;&#19978;&#19981;&#24184;&#34987;&#21029;&#20154;&#38283;&#36554;&#32102;&#25758;&#21040;&#65292;&#25110;&#26159;&#32903;&#20107;&#32773;&#30340;&#36554;&#27794;&#26377;&#20219;&#20309;&#20445;&#38570;&#65292;&#29978;&#33267;&#32903;&#20107;&#32773;&#26681;&#26412;&#27794;&#26377;&#37666;&#20358;&#36064;&#20320;&#12290;</p><p>&#26082;&#28982;&#36889;&#26159;&#24375;&#21046;&#24615;&#36074;&#30340;&#65292;&#23427;&#30340;&#20445;&#38556;&#22522;&#26412;&#19978;&#19981;&#26371;&#22826;&#22810;[&#178;]&#65292;&#32780;&#19988;&#23427;&#21482;&#20445;&#20154;&#39636;&#214
63;&#20663;&#30340;&#37096;&#20221;&#65292;&#20006;&#19981;&#21253;&#25324;&#36001;&#29289;&#25613;&#23475; (&#27604;&#22914;&#35498;&#20462;&#36554;&#36027;&#23601;&#19981;&#34892;)&#12290;&#23427;&#30340;&#29702;&#36064;&#25033;&#35442;&#30475;&#20316;&#26159;&#19968;&#31278;&#26368;&#26368;&#26368;&#22522;&#26412;&#30340;&#20445;&#38556;&#65292;&#19968;&#23450;&#35201;&#20877;&#25645;&#37197;&#20854;&#20182;&#30340;&#20445;&#38570;&#25165;&#22816;&#65292;&#21315;&#33836;&#21029;&#20197;&#28858;&#21482;&#20445;&#20102;&#36889;&#20491;&#23601;&#19968;&#20999;&#25630;&#23450; (&#27794;&#36889;&#40636;&#31777;&#21934;)&#12290;</p><p>&#24375;&#21046;&#36012;&#20219;&#38570;&#30340;&#22909;&#34389;&#22312;&#26044;&#65292;&#21999;&nbsp;&#8230; &#23427;&#26159;&#24375;&#21046;&#30340;&nbsp;&#8230; &#20063;&#23601;&#26159;&#35498;&#27599;&#19968;&#36635;&#36554;&#37117;&#24471;&#20445;&#65292;&#22240;&#27492;&#29702;&#35542;&#19978;&#27599;&#19968;&#20214;&#20132;&#36890;&#20107;&#25925;&#30340;&#21463;&#23475;&#20154;&#37117;&#26371;&#21463;&#21040;&#20445;&#38556; (&#33267;&#23569;&#26159;&#36319;&#36554;&#23376;&#26377;&#38364;&#30340;)&#65292;&#20107;&#23526;&#19978;&#20063;&#26159;&#22914;&#27492;&#65292;&#22240;&#28858;&#23601;&#31639;&#32903;&#20107;&#32773; &#8220;&#22909;&#33213;&#8221; &#27794;&#20445;&#65292;&#29978;&#33267;&#26159;&#22909;&#27515;&#19981;&#27515;&#36935;&#21040;&#32903;&#20107;&#32773;&#36867;&#36920;&#65292;&#25105;&#20497;&#36996;&#26159;&#21487;&#20197;&#29554;&#24471;&#36889;&#37096;&#20221;&#30340;&#29702;&#36064; (&#25563;&#21477;&#35441;&#35498;&#65292;&#36889;&#26159;&#25919;&#24220;&#32102;&#25105;&#20497;&#30340;&#19968;&#31278;&#20445;&#38556;)&#12290;&#38364;&#26044;&#24375;&#21046;&#36012;&#20219;&#38570;&#65292;&#26377;&#19968;&#20123;&#27604;&#36611;&#38656;&#35201;&#27880;&#24847;&#30340;&#38917;&#30446;&#65292;&#27604;&#22914;&#35498;&#22312;&#8221;&#19968;&#36554;&#36635;&#20107;&#25925;&#8221;[&#179;]&#20013; 
&#8220;&#39381;&#39387;&#20154;&#8221; &#21463;&#20663;&#26159;&#19981;&#29702;&#36064;&#30340; (&#22909;&#27604;&#35498;&#26032;&#25163;&#19978;&#36335;&#65292;&#32080;&#26524;&#38283;&#21435;&#25758;&#36335;&#29128;&#32780;&#21463;&#20663;)&#65292;&#36889;&#37096;&#20221;&#21487;&#33021;&#23601;&#38656;&#35201;&#21478;&#22806;&#20445;&#20491;&#24847;&#22806;&#38570;&#25110;&#26159;&#39381;&#39387;&#20154;&#20663;&#23475;&#38570;&#20358;&#35036;&#24375;&#19968;&#19979;&#12290;&#36996;&#26377;&#65292;&#20854;&#26412;&#19978;&#19981;&#31649;&#25105;&#20497;&#26159;&#21463;&#23475;&#32773;&#36996;&#26159;&#32903;&#20107;&#32773;&#65292;&#21482;&#35201;&#26377;&#21463;&#20663;&#23601;&#21487;&#20197;&#29554;&#24471;&#24375;&#21046;&#38570;&#30340;&#20445;&#38556;&#65292;&#19981;&#36942;&#26377;&#19968;&#20123;&#20363;&#22806;&#24773;&#24418;&#65292;&#20687;&#37202;&#39381;&#12289;&#28961;&#29031;&#39381;&#39387;&#31561;[&#8308;]&#65292;&#20445;&#38570;&#20844;&#21496;&#20107;&#24460;&#26159;&#26371;&#21521;&#32903;&#20107;&#32773;&#27714;&#20767;&#30340; (&#20063;&#23601;&#26159;&#35498;&#22914;&#26524;&#20320;&#20098;&#25630;&#65292;&#25265;&#27465;&#65292;&#35531;&#33258;&#24049;&#21435;&#36064;&#23565;&#26041;)&#12290;</p><p>&#36023;&#36889;&#20491;&#38570;&#27794;&#21861;&#22909;&#36984;&#30340;&#65292;&#27861;&#24459;&#35215;&#23450;&#19968;&#23450;&#35201;&#20445;&#65292;&#32780;&#19988;&#20445;&#36027;&#20063;&#26159;&#25919;&#24220;&#32113;&#19968;&#23450;&#20729;[&#8309;]&#65292;&#22522;&#26412;&#19978;&#36319;&#21738;&#19968;&#38291;&#20445;&#38570;&#20844;&#21496;&#36023;&#37117;&#19968;&#27171;&#12290;&#24375;&#21046;&#36012;&#20219;&#38570;&#30340;&#20445;&#36027;&#36319;&#24180;&#32000;&#12289;&#24615;&#21029;&#12289;&#36996;&#26377;&#32903;&#20107;&#32000;&#37636;&#37117;&#26377;&#38364;&#20418; 
(&#32000;&#37636;&#26159;&#36319;&#33879;&#20154;&#30340;)&#65292;&#19981;&#36942;&#23427;&#22312;&#36554;&#38570;&#30070;&#20013;&#31639;&#26159;&#30456;&#23565;&#20415;&#23452;&#30340;&#65292;&#36890;&#24120;&#20108;&#21315;&#20803;&#19978;&#19979;&#23601;&#24046;&#19981;&#22810;&#20102;&#65292;&#32780;&#19988;&#21482;&#35201;&#19968;&#24180;&#20839;&#27794;&#26377;&#32903;&#20107;&#32000;&#37636;&#23601;&#21487;&#20197;&#25240; 18% &#30340;&#20445;&#36027;&#65292;&#36899;&#32396;&#20841;&#24180;&#27794;&#26377;&#21063;&#21487;&#20197;&#25240; 26%&#65292;&#26368;&#39640;&#36899;&#32396;&#19977;&#24180;&#27794;&#26377;&#32903;&#20107;&#32000;&#37636;&#21487;&#20197;&#25240; 30%&#12290;&#20294;&#26159;&#21453;&#36942;&#20358;&#35498;&#65292;&#22914;&#26524;&#19977;&#24180;&#20839;&#26377;&#20219;&#20309;&#32903;&#20107;&#32000;&#37636;&#30340;&#35441; (&#32000;&#37636;&#26371;&#32047;&#35336;)&#65292;&#26368;&#39640;&#21487;&#33021;&#24471;&#22810;&#20184; 60% &#30340;&#24375;&#21046;&#38570;&#20445;&#36027;&#21734;!</p><h4>&#31532;&#19977;&#20154;&#24847;&#22806;&#36012;&#20219;&#38570;</h4><p>&#24375;&#21046;&#36012;&#20219;&#38570;&#21482;&#20445;&#20154;&#39636;&#21463;&#20663;&#30340;&#37096;&#20221;&#65292;&#32780;&#19988;&#20445;&#38989;&#26368;&#39640;&#27599;&#19968;&#20301;&#21463;&#23475;&#20154;&#21482;&#26377; 20 &#33836; (&#21463;&#20663;)&#12289;150 &#33836; (&#27515;&#20129;&#25110;&#27544;&#24290;)&#65292;&#20197;&#30446;&#21069;&#19968;&#33324;&#30340;&#23526;&#38555;&#26696;&#20363;&#20358;&#35611;[&#8310;]&#65292;&#27861;&#38498;&#21028;&#36064;&#37117;&#22312; 300~500 &#33836;&#20043;&#38291;&#65292;&#24375;&#21046;&#36012;&#20219;&#38570;&#20854;&#23526;&#26159;&#19981;&#22816;&#30340;&#12290;&#19981;&#22816;&#24590;&#40636;&#36774;? &#19981;&#22816;&#23601;&#24471;&#33258;&#24049;&#21478;&#22806;&#20184;&#21834;! 
Think about it: having an accident is bad luck enough. If you are unlucky enough to be found at fault, you also have to pay a large settlement, and whether you can even afford it is an open question. How would you get by after that?</p><p>To cover that shortfall, you need third-party liability insurance as a supplement. Taking myself as an example, I budget about 4.5 million per person; after deducting the 1.5 million from compulsory insurance, I need another 3 million of third-party liability coverage. Of course, how much to budget is a personal call; the statistics reported in the newspapers should give you a rough idea.</p><p>Third-party liability insurance is basically split into two parts, bodily injury and property damage, which are insured separately. Its purpose is simple: it is basically there to compensate the other party 
(hence "third party"): the other party may be injured, disabled, or even killed, and there may also be damage to their property.</p><p>The premium for this is really not expensive (compared with Type A or Type B vehicle damage insurance, that is). From the quotes I got from several insurers, 100/200[&#8311;] costs only about NT$500, raising it to 200/400 is only around NT$850, and going up to 300/600 is just over NT$1,000. As for property damage, 300,000 of coverage is about NT$1,100 and 500,000 is around NT$1,400. In other words, for roughly NT$2,000 more per year (adding third-party liability on top of compulsory insurance), you get "more adequate" protection.</p><p>Many people are willing to spend tens of thousands a year on vehicle damage insurance (and then get a full respray just before the policy expires), yet won't spend a few thousand on a bit more liability coverage. I really don't get it. Is NT$2,000 really that expensive? 
Accidents really are unpredictable. Every time I see drivers on the road who seem to have no idea what they are doing, I feel that no matter how careful I am, I can't guarantee I won't someday have the bad luck to run into a clueless one.</p><p>Remember: third-party liability insurance is a must, and you must buy enough of it!</p><h4>Third-Party Liability Insurance (Multiple Type)</h4><p>The bodily injury part of third-party liability insurance usually takes the 100/200 form (or 200/400 or 300/600, depending on how much you buy), meaning a maximum payout of 1 million per victim and 2 million per accident. This "double" type can easily leave you underinsured. For example, if there are three people in the other car and each is awarded 850,000 (2.55 million in total), you have to pay the missing 550,000 yourself. With so many CRVs 
and SUVs on the road these days, what if the other car has four or five people in it?</p><p>In recent years insurers have introduced a new "multiple type" of third-party liability insurance that raises the per-accident limit, to perhaps three, five, or even ten times the per-person limit. I chose five times (the odds of my car hitting a vehicle that seats more than five people, and injuring them all badly, should be low), that is, 300/1500, which covers a situation with at least five victims. Sounds good, but you are probably thinking the premium for this multiple type must be much higher, right? 
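</p><p>The per-person and per-accident limits described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, amounts are in units of 10,000, and compulsory insurance is ignored, as in the three-passenger example.</p>
<pre>
```python
# Hypothetical helpers for illustration only; amounts are in units of 10,000.


def insurer_payout(awards, per_person_cap, per_accident_cap):
    """Cap each victim's award per person, then cap the total per accident."""
    capped_total = sum(min(award, per_person_cap) for award in awards)
    return min(capped_total, per_accident_cap)


def paid_by_driver(awards, per_person_cap, per_accident_cap):
    """Part of the total awards left for the at-fault driver to pay."""
    return sum(awards) - insurer_payout(awards, per_person_cap, per_accident_cap)


awards = [85, 85, 85]  # three victims awarded 85 each, as in the example

print(paid_by_driver(awards, 100, 200))   # double type 100/200: prints 55
print(paid_by_driver(awards, 300, 1500))  # five-times type 300/1500: prints 0
```
</pre>
<p>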
What surprised me most, though, is that the five-times type I bought costs less than NT$100 more than the double type!</p><p>Remember to ask your insurer whether it offers this product. If not, switch to one that does![&#8312;]</p><h4>Excess Liability Insurance</h4><p>This is another new product. If you feel third-party liability coverage is still not enough, you can add this one (usually as a rider to the third-party liability policy). Excess liability insurance provides one more layer of protection when compulsory and third-party coverage combined still fall short. A good example is hitting a luxury car worth several million: the repair bill alone could exhaust your property damage coverage, at which point the excess liability coverage pays the rest. An extra layer of protection is always a good thing 
(and in Taiwan, luxury cars worth millions are all over the streets). This product is not too expensive: 5 million of coverage costs nearly NT$2,000, which is definitely cheaper than the property damage premium under third-party liability (for the same amount of coverage). I suggest not buying too much property damage coverage there (say, 300,000) and topping up with this instead to save a little.</p><p>For my part, I bought 300/1500 (persons) + 50 (property) of third-party liability insurance plus 5 million of excess liability insurance. That should be enough protection for peace of mind (though not, of course, a license to drive recklessly).</p><h4>Driver Injury Insurance and Passenger Liability Insurance</h4><p>Third-party liability insurance protects only the other party (that is, everyone except the people sitting inside our own car). So what about the people riding in our car? 
And what about the driver?</p><p>Of course, there is still compulsory liability insurance, which covers everyone (driver and passengers included). But if you worry that compulsory coverage is not enough, you can add these two policies as a supplement (don't ask me why it has to be this complicated). I figure the people riding in our car are either family or close friends. If they get hurt, we would already feel terrible, so paying a bit more toward medical bills and consolation money is only right. Of course, if you never carry passengers, you can skip this part.</p><p>The premium is not expensive either, about NT$1,500.</p><h4>Vehicle Damage Insurance</h4><p>This is the familiar Type A, Type B, and Type C: Type B costs more than Type C, and Type A costs more than Type B (outrageously more). That's roughly it&nbsp;&#8230; 
The three differ mainly in scope of coverage. Type A pays for almost anything (if the car is damaged, it pays); Type C pays only for car-to-car collisions (in other words, hitting a utility pole or a cow is not covered); and Type B sits in between (adding fire, lightning, falling objects, and the like on top of Type C).</p><p>The vehicle damage premium[&#8313;] depends on the car (make, model, year) as well as the owner's sex, age, and driving record. As with compulsory insurance, a clean claims record earns a discount, so here is roughly how it is calculated. There is no discount on the car and owner components: a table fixes them, so each car has its coefficient, and the owner's age and sex map to fixed coefficients. Nothing more to say there. The discount applies to the no-claims component. First, it looks at the number of claims within the past three years: 
one claim means no discount, two claims add 20% to the premium, three add 40%, four add 60%, and so on (with no upper limit). Second, it gives a discount based on the number of consecutive claim-free years: one claim-free year takes off 20%, two consecutive years 40%, and three years 60% (only three years are counted at most). Adding the two parts together gives the final no-claims coefficient.</p><p>For example, suppose I bought a new car the year before last, was talked by my insurance agent into filing one claim for a cosmetic touch-up at the end of last year (damn), and have wised up and filed no claims this year. Next year's no-claims coefficient is then 0% (one claim) - 20% (one claim-free year) = -20%, a 20% discount. But if I am unlucky (same setup: talked into one cosmetic claim last year) and file another claim this year because of an accident, the coefficient becomes 20% (two claims) - 0% 
(no consecutive claim-free years) = 20%, so I pay 20% more! In the same situation, if I had not been talked into the cosmetic claim last year, then even with this year's unlucky claim the coefficient would be only 0% (one claim) - 0% (no consecutive claim-free years) = 0%. That makes a real difference: on a Type A policy it could be tens of thousands!</p><p>It looks rather complicated, but it really boils down to something simple: don't file claims unless you have to (because any claim is certain to raise next year's premium). Stop believing agents' unfounded claims; the wool always comes off the sheep's own back!</p><h4>Theft Insurance</h4><p>The theft insurance premium depends on the value of the car itself, and that value is depreciated. For example, a new car bought for 1 million will be worth about 50% of that after roughly three years 
(depreciation rates vary by make and model). The usual advice is that theft insurance is mostly worth having in a new car's first few years; once the car is old and worth little, insuring it feels like a waste.</p><p>Not much more to say here: if you are afraid of your car being stolen, insure it!</p><h4>Car Insurance Discounts</h4><p>As mentioned earlier, as long as you keep a record free of at-fault accidents, both compulsory liability insurance and vehicle damage insurance are discounted (so is third-party liability insurance, in fact[&#185;&#8304;]). These discounts are officially set: the compulsory liability insurance rates, for instance, are fixed by law, and if your insurer fails to apply the discount you can file a complaint. Beyond that, if you consistently insure with the same company or agent, there may be an extra discount 
(perhaps the agent passes on part of the commission); ours gives us a further 10% off.</p><h4>Who Is Driving Matters</h4><p>Only vehicle damage insurance requires you to think about this. Compulsory liability, third-party liability, and theft insurance all pay out no matter who was driving at the time of the accident. Vehicle damage insurance is different: unless the driver is an insured person (the owner or someone specifically named), or the owner's spouse, a family member living in the same household, a blood relative within the fourth degree, or a relative by marriage within the third degree, it will not pay out. The most common case is a boyfriend or girlfriend: if your partner crashes the car, the insurer will still seek compensation from him or her, so in that situation be sure to have your partner (boyfriend or girlfriend) 
specifically named as an insured person.</p><p>So don't lend your car out casually, and don't assume that having insurance means everything is OK&nbsp;&#8230;</p><p>There is in fact another special kind of vehicle damage insurance, the "named driver" type. Its premium is lower (about 65% of the normal one), but it requires the driver to be the owner or the owner's spouse; otherwise no claim is paid. Vehicle damage insurance is expensive, so if you never lend your car to others and only you and your spouse drive it, this is money you can save.</p><h4>The Golden Combination</h4><p>All right. Driving is tiring, and we are all more worn out than cowboys, but it is still worth taking the time, for yourself and your family, to check whether your car insurance adds up to your own golden combination:</p><ul><li><p>Remember to buy compulsory liability insurance, so you don't get fined.</p></li><li><p>Be sure to add third-party liability insurance: at least 2 
million for bodily injury, preferably 3 to 4 million or more, and at least 300,000 for property damage. Choose the multiple type, at least five times. Add 3 to 5 million of excess liability insurance. Add passenger liability insurance with a limit of at least 100,000. If you don't already have personal accident insurance (really, how could you not?), add driver injury insurance as well.</p></li><li><p>Vehicle damage insurance depends on the person. If you drive carefully and cautiously and have never had a serious accident, you may be fine without it, because even if something happens, the worst case is paying for your own repairs, which is no big deal (unlike what you might owe other people). But if you treat your car like a beloved wife, often get into accidents 
(a bit of a contradiction?), and drive a luxury car worth millions, a Type A policy may give you more peace of mind.</p></li><li><p>Theft insurance: if the car is less than three years old, buy it. Staking a few thousand against a loss of several hundred thousand (for a domestic car) is still a fair deal (it is all about the what-ifs). If you have your own garage and rarely park outside, or the car is years old and has depreciated to a few tens of thousands, you probably don't need it.</p></li></ul><p>[&#185;]: <a href="https://www.cathay-ins.com.tw/page2/2.htm#4">What penalties apply for not carrying compulsory automobile (motorcycle) liability insurance?</a></p><p>[&#178;]: <a href="https://www.cathay-ins.com.tw/page2/2.htm#1">What coverage does compulsory insurance provide?</a></p><p>[&#179;]: <a href="https://www.cathay-ins.com.tw/page2/2.htm#9">Single-vehicle accidents</a></p><p>[&#8308;]: <a href="https://www.cathay-ins.com.tw/page2/2-0-0-1.htm#3">In which cases will Century Property Insurance seek recovery from the at-fault party after paying a claim?</a></p><p>[&#8309;]: Compulsory Automobile Liability Insurance Rate Table, 
Compulsory Automobile Liability Insurance Rate Table for Year 95 (2006)</p><p>[&#8310;]: <a href="http://www.unionins.com.tw/Union_insurancewhatnew_detail.asp?iwn_rfnbr=42">Car accident consolation money, "red envelope insurance": claims capped at NT$50,000</a></p><p>[&#8311;]: 100/200 means each victim can receive up to 1 million in compensation, with a cap of 2 million per accident (regardless of the number of people).</p><p>[&#8312;]: You may wonder why not simply buy the double type up to 750/1500. Because that premium would be far higher than the five-times 300/1500 (not a matter of a few hundred, but certainly several times as much).</p><p>[&#8313;]: <a href="http://5i01.com/topicdetail.php?f=294&amp;t=427579">How vehicle damage insurance premiums are calculated</a></p><p>[&#185;&#8304;]: <a href="http://www.honda.club.tw/viewthread.php?tid=3951">How premiums for vehicle damage, compulsory, and third-party liability insurance are calculated</a></p>]]></content:encoded></item><item><title><![CDATA[Should You Buy Medical Insurance?]]></title><description><![CDATA[I don't know why, but friends keep bringing up medical insurance with me lately &#8230;]]></description><link>https://www.bryantsai.com/p/medical-insurance</link><guid 
isPermaLink="false">https://www.bryantsai.com/p/medical-insurance</guid><dc:creator><![CDATA[Bryan Tsai]]></dc:creator><pubDate>Fri, 27 Jul 2007 07:00:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t3wB!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5bb1d8e-eb72-4ee6-934d-b03497589215_144x144.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I don't know why, but friends keep bringing up medical insurance with me lately&nbsp;&#8230;</p><p>Yes, I carry no medical insurance at all. But to avoid explaining my reasons over and over, I decided to work through the numbers properly, so the next time someone asks, I can just point them to this post!</p><p>Answering this question is actually not easy (otherwise insurers would not be making so much money), so this post is a bit long. Those without the patience to read it all can peek at my conclusions first:</p><p>* Never buy whole-life medical insurance, whether unlimited or capped.<br>* 
That said, it is best to buy term medical insurance.</p><p>That is my final conclusion, but there are plenty of caveats and analysis below. I suggest reading on, so you don't come back and blame me after making your decision.</p><p>So what exactly is wrong with whole-life medical insurance? (since I say not to buy it)</p><p>"Whole-life" means the coverage lasts a lifetime. Generally, whole-life medical insurance comes in two kinds. One has no benefit cap (unlimited): as long as you meet the claim criteria, the insurer has to keep paying you (until you have the good fortune to "go home"). The other has a benefit cap 
(the so-called "account type"): no matter how many claims you file or how much each one is for, the insurer will pay you only up to a fixed total, say 1 million (the actual figure depends on how much coverage you buy). Put that way, the account type hardly deserves to be called "whole-life" medical insurance, because we could well fall ill before we are even old and use up the entire limit at once.</p><p>I think any normal person would choose the unlimited kind&nbsp;&#8230; The reason is simple: with a benefit cap, what if the limit runs short when I am old and really need it 
(say an illness before sixty uses up 400,000, leaving only 600,000, so after sixty I would have to ration it)? Then why did I buy this insurance?!</p><p>But here is some -good- bad news: after this year, unlimited whole-life medical insurance will be history[&#185;], meaning insurers will no longer sell it. Since it can no longer be bought, I don't think it needs separate analysis, so here I analyze only the account type (I'm lazy, and since neither is a good deal, I'll pick the simpler one to compare).</p><p>Below I use Nan Shan Life's whole-life medical insurance as an example, running the numbers for a male, age 30, 20-year payment term, 10 units (daily benefit 1000):</p><p><a href="http://spreadsheets.google.com/pub?key=pFc93M6y-5nSMLBEIT2qeYA&amp;output=html&amp;gid=2&amp;single=true&amp;widget=true">Whole-life medical insurance vs. 
setting aside your own medical fund</a></p><p>In short, with this policy you pay Nan Shan $27880 a year for 20 years, and once it is fully paid Nan Shan covers you for life (but pays out at most 1 million). Working it out, we pay $557600 in total and get 1 million of coverage in return. Not bad, right? That is roughly an 80% gain~</p><p>But is that really so?</p><p>The figures insurers show you usually leave out, selectively, a few very important points. The first is ignoring the effect of inflation on the coverage amount. What is inflation[&#178;]? Just recall that when you were a kid a bowl of plain noodles cost only NT$15 (fine, yours cost NT$20, OK?!). How much does a bowl cost now? 
Under normal conditions the money in our hands buys less and less. The same NT$15 could at least buy a bowl of noodles 20 years ago; now it might not even cover a small side dish! Coming back to the sum insured: NT$1,000,000 looks like a lot today, but 20 years from now it may not be big money at all&nbsp;&#8230;</p><p>So in column six of the table above I factor inflation in as well, to see what it does to the NT$1,000,000 sum insured. I assume average inflation of just 2% a year, which is on the low side given the statistics of the past two or three decades&nbsp;&#8230; Look at what happens once the 20 years of premiums are paid: because of inflation, the original NT$1,000,000 is worth only about NT$670,000 (that is, spending NT$1,000,000 in 20 years will feel the same as spending NT$670,000
today). This is not hard to grasp. In the past year or two every price has kept climbing (fuel, snack shops, toilet paper, clinic fees&nbsp;&#8230;). Two years ago a doctor's visit could be settled for NT$200 to NT$300; now it starts at NT$300 to NT$400. Heaven knows what a doctor's visit will cost in 20 or 30 years!</p><p>So at 30 you buy a whole-life medical policy thinking you are getting NT$1,000,000 of coverage, but by 50 it is equivalent to only NT$670,000&nbsp;&#8230; It gets worse: because the policy is for life, inflation keeps eating away at it, leaving only NT$550,000 at 60, NT$450,000 at 70, and NT$360,000 at 80&nbsp;&#8230; Grim! And don't forget, this is insurance bought with a total of NT$557,600 paid while young (between 30 and 50)~</p><p>Where does the difference come from? 
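</p><p>The inflation-adjusted figures above can be reproduced with a few lines of Python. This is only a sketch: the spreadsheet's exact formula is not shown in the post, so the assumption here (shrinking the sum insured by 2% each year, i.e. multiplying by 0.98 per year) is mine, chosen because it matches the quoted numbers. The function name <code>real_coverage</code> is likewise made up for illustration.</p>

```python
# Sketch: real (inflation-adjusted) value of a fixed sum insured.
# Assumption (not taken from the post's spreadsheet): the value shrinks
# by `inflation` each year, i.e. it is multiplied by (1 - inflation).

SUM_INSURED = 1_000_000   # NT$, the account-type payout cap from the example
INFLATION = 0.02          # 2% a year, as assumed in the post
START_AGE = 30            # age when the policy is bought


def real_coverage(age, sum_insured=SUM_INSURED, inflation=INFLATION, start_age=START_AGE):
    """Value of the sum insured at `age`, expressed in age-30 money."""
    years = age - start_age
    return sum_insured * (1 - inflation) ** years


for age in (50, 60, 70, 80):
    print(f"age {age}: about NT${real_coverage(age):,.0f}")
```

<p>Rounded to the nearest NT$10,000 this gives 670,000, 550,000, 450,000 and 360,000, the same figures quoted above. Dividing by 1.02 per year instead would give slightly higher values (about NT$370,000 at 80).</p><p>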
The difference is that this NT$1,000,000 of coverage earns no interest. Unlike money in the bank or money you invest yourself, this NT$1,000,000 is dead money: as time passes and inflation erodes it, it slowly dwindles away.</p><p>Let us look at it from another angle. If at 30 we skip the whole-life medical policy and instead build our own medical reserve by managing the money ourselves, how would the outcome differ? Start conservative and pick a time deposit (conservative enough, right?!). With a little effort, find one paying 2.55% a year and put the same NT$27,880 into it every year for 20 years. Note that I factor inflation in here as well, so the 2.55% deposit really earns only about 0.55%; that keeps the comparison fair. Now see what a time deposit can do: by 50 it has grown to NT$590,000. Oh, not bad, though it still loses to the whole-life policy. But at 60?
NT$620,000, well above the NT$550,000 that the whole-life policy has shrunk to by then! Even the most, most, most conservative time deposit comfortably overtakes the whole-life policy at around 55! And further down? Truly, the longer you live, the worse a deal whole-life medical insurance becomes!</p><p>Then again, nobody is that conservative. Time deposits are last century's way of managing money; with a little effort it is easy to find good targets that return more than a time deposit. For example, buy blue-chip stocks and collect the <a href="https://bryantsai.com/dividend-43a85947c5e3">dividends</a> every year. There are plenty of such stocks; the main requirement is stability. Formosa Plastics Group (&#21488;&#22609;&#38598;&#22296;), China Steel (&#20013;&#37628;), or the Taiwan 50 (&#21488;&#28771; 50) are all suitable, with yearly yields of at least around 4%, so let us run the numbers with 4% (again minus 2% for inflation): at 50
it already beats the whole-life policy~ At 60 you are ahead by almost NT$300,000, at 70 by almost NT$600,000, and at 80 by almost NT$900,000!</p><p>The spreadsheet also has columns for 6% and 10% for your reference. Frankly, 6% to 10% is not a terribly greedy investment target either, which is why I say whole-life medical insurance is a really poor deal. No wonder insurance is always among the most profitable businesses!</p><p>Getting the picture? 
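</p><p>The self-funded side of the comparison can be sketched in Python. The assumptions are mine, not the spreadsheet's: each NT$27,880 deposit goes in at the start of the year, the real return is simply the nominal rate minus 2% inflation (as the post does), and the insurance side shrinks by 2% a year. The names <code>reserve_balance</code> and <code>real_coverage</code> are hypothetical.</p>

```python
# Sketch: self-funded medical reserve vs. inflation-eroded coverage.
# Assumptions (mine): deposits at the start of each year;
# real return = nominal rate - inflation; coverage shrinks 2% a year.

DEPOSIT = 27_880      # NT$ per year, same as the whole-life premium
PAY_YEARS = 20        # deposits are made at ages 30 through 49
START_AGE = 30
INFLATION = 0.02


def reserve_balance(age, nominal_rate):
    """Reserve at `age` in age-30 money, compounding yearly at the real rate."""
    real_rate = nominal_rate - INFLATION
    balance = 0.0
    for year in range(age - START_AGE):
        # Only the first PAY_YEARS years receive a deposit.
        deposit = DEPOSIT if year in range(PAY_YEARS) else 0
        balance = (balance + deposit) * (1 + real_rate)
    return balance


def real_coverage(age, sum_insured=1_000_000):
    """Fixed sum insured, shrunk by 2% a year since age 30."""
    return sum_insured * (1 - INFLATION) ** (age - START_AGE)


for age in (50, 55, 60, 70, 80):
    deposit_case = reserve_balance(age, 0.0255)   # time deposit
    stock_case = reserve_balance(age, 0.04)       # dividend stocks
    print(f"age {age}: deposit {deposit_case:,.0f}  stocks {stock_case:,.0f}  "
          f"insurance {real_coverage(age):,.0f}")
```

<p>Under these assumptions the time deposit reaches about NT$590,000 at 50 and NT$620,000 at 60, and passes the eroded coverage at 55; the 4% case is ahead by roughly NT$300,000, NT$580,000 and NT$890,000 at 60, 70 and 80, in line with the figures above.</p><p>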
Money should be put to work earning more money. I am not telling everyone to speculate in stocks or play futures, but even a time deposit beats buying whole-life insurance (though try to avoid time deposits too, they are such a waste; at least buy REITs or the Taiwan 50). All the more so because what we are talking about here is the medical reserve we ourselves will draw on later: a gap this big in the numbers means an equally big gap in the quality of care we can afford!</p><p>Still not convinced?!
Let me name a few more drawbacks of whole-life medical insurance to shore up your conviction. Besides the cap on total payouts, it also has a daily limit. Our calculation here uses a daily benefit of NT$1,000, meaning the insurer pays at most NT$1,000 a day&nbsp;&#8230; There is of course also the distinction between expense-reimbursement and daily-benefit policies, but let us keep things simple here. If I am hospitalized one day and it costs NT$1,500, sorry, the extra NT$500 comes out of my own pocket; it cannot be claimed. Naturally the daily benefit can be set however you like; we could choose NT$2,000, only we would have to pay twice the premium&nbsp;&#8230; 
On top of that, medical insurance comes with many other assorted restrictions, such as a cap on what each surgery can be reimbursed, and which illnesses are covered and which are not. My point is this: even with NT$1,000,000 of coverage sitting at the insurer, the thought of having to file a claim every time I want to use it, with a pile of policy clauses in the way, makes me feel I would rather keep the money at hand, use it freely when needed, and let it grow through investing when not (instead of leaving it with the insurer, where it can only slowly lose value). Isn't that better!?</p><p>All right, that completes my first conclusion&nbsp;&#8230;</p><p>&gt; Rather than buying whole-life medical insurance, keep that money; even in a time deposit it will easily give you more protection than the whole-life policy would
(that is, you end up with more money saved!)</p><p>Next, how is term medical insurance different? Why do I say it is best to buy it?</p><p>If you follow my first conclusion, the biggest risk is the &#8220;what if&#8221;: what if luck really fails you and you fall seriously ill in your thirties or forties? Life could get difficult. You might lose your job and your income because of the illness, and you might have to spend most of your savings on treatment. The problem is that at this stage the medical reserve has only just started and is nowhere near sufficient (at a 4% return on investment, it has reached only NT$310,000 by age 40). Even if it happens to be just enough to cure the illness, you have to start all over again
(the medical reserve is back to zero and must be rebuilt), and life afterwards would be miserable!</p><p>But that is simply a risk of life, and it is the real reason for buying insurance against &#8220;risk&#8221; in the first place. In other words, medical insurance does have its place, but that does not mean it has to be the whole-life kind, because there is another type: term medical insurance.</p><p>&#8220;Term&#8221; means the coverage lasts only for a fixed period (usually one year at a time), and when the period is up you have to buy again. But precisely because the coverage period is fixed (less risk for the insurer), the premium is much cheaper. A year of term medical insurance usually costs only a few thousand NT dollars, far less than the tens of thousands whole-life medical insurance often costs, yet the coverage is not much different
(depending, of course, on the specific product).</p><p>So for the problem just mentioned, pairing the reserve with term medical insurance gives you the protection. Let me run the numbers again: what does it look like if we save the money that would have gone to whole-life medical insurance, invest it ourselves, and add term medical insurance on top?</p><p><a href="http://spreadsheets.google.com/pub?key=pFc93M6y-5nSMLBEIT2qeYA&amp;output=html&amp;gid=4&amp;single=true&amp;widget=true">Whole-life medical insurance vs. term medical insurance + a self-funded medical reserve</a></p><p>Put simply, out of each NT$27,880 installment I take NT$2,100 to buy term medical insurance and save the remaining NT$25,780 for investment (time deposits, blue-chip stocks&nbsp;&#8230;), for 20 years in a row. Keep in mind that the term premium rises slightly with age, though it stays within a few thousand NT dollars. At a 4% return on investment
(forget time deposits, they really are too conservative), by age 50 the medical reserve you can build up yourself, NT$610,000, is already close to the whole-life policy's inflation-adjusted NT$670,000. In fact, look a few rows further down: wait just three more years and by 53 it has already pulled ahead! Now we not only have the same protection (through the term medical policy), but the medical reserve in hand is also higher than the whole-life coverage. A double win! 
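</p><p>A rough version of this second scenario, as a sketch under my own assumptions: the term premium is held flat at NT$2,100 (the post says it actually rises with age, and the real schedule is not given), so NT$25,780 is invested at the start of every year at a real return of 2% (4% minus inflation). The function name <code>reserve_with_term_premium</code> is hypothetical.</p>

```python
# Sketch: invest what is left after paying a term medical premium.
# Assumption (mine): a flat premium; the post notes the real premium rises with age.

BUDGET = 27_880          # NT$ per year, what the whole-life policy would cost
TERM_PREMIUM = 2_100     # NT$ per year, the term premium quoted in the post
REAL_RATE = 0.04 - 0.02  # 4% return minus 2% inflation


def reserve_with_term_premium(years, budget=BUDGET, premium=TERM_PREMIUM, rate=REAL_RATE):
    """Reserve after `years` of investing (budget - premium) at the start of each year."""
    balance = 0.0
    for _ in range(years):
        balance = (balance + budget - premium) * (1 + rate)
    return balance


print(f"age 50: about NT${reserve_with_term_premium(20):,.0f}")
```

<p>With a flat premium this comes to about NT$640,000 at 50, a little above the NT$610,000 quoted above; that gap is consistent with a rising premium leaving less to invest each year.</p><p>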
So what reason is left for insisting on whole-life medical insurance?</p><p>And so my second conclusion is complete:</p><p>&gt; To guard against the unlucky &#8220;what if&#8221;, it is best to buy a term medical policy: the cost is low, yet it gives us very substantial protection.</p><p>With term insurance (medical or life alike), the premiums paid are gone once the term expires; that should be obvious and perfectly natural. Yet many people cannot let go of this, and in the end they often turn (under the insurer's coaxing&nbsp;&#8230;) to the embrace of whole-life insurance. I hope the analysis here gives everyone a deeper understanding of what whole-life insurance really is. Buying insurance is, after all, buying protection. If misfortune strikes and you need it, thank goodness you were covered!
If you are &#8220;lucky&#8221; enough to stay safe and sound and never need it, that is heaven's blessing, and all the more reason to be glad the premium was money well spent. Whatever you do, don't let that feeling of &#8220;I want something back for my money&#8221; lead you to buy a pile of insurance that is not really insurance at all!</p><p>[&#185;]: <a href="http://www.udn.com/2007/7/25/NEWS/STOCK/STO9/3942263.shtml">&#32879;&#21512;&#29702;&#36001;&#32178;: Unlimited whole-life medical insurance selling fast</a></p><p>[&#178;]: <a href="http://zh.wikipedia.org/w/index.php?title=%E9%80%9A%E8%B4%A7%E8%86%A8%E8%83%80&amp;variant=zh-tw">Wikipedia: &#36890;&#36008;&#33192;&#33081; (Inflation)</a></p>]]></content:encoded></item></channel></rss>