<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://www.dinhphu28.com/feed.xml" rel="self" type="application/atom+xml"/><link href="https://www.dinhphu28.com/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-06-12T08:16:00+00:00</updated><id>https://www.dinhphu28.com/feed.xml</id><title type="html">Dinh Phu Nguyen</title><subtitle>Dinh Phu Nguyen, also known as dinhphu28, is a Software Engineer with extensive experience in Java (Spring Boot), RESTful web services, and software development. </subtitle><entry><title type="html">AI cannot compress a huge chunk of knowledge into your brain</title><link href="https://www.dinhphu28.com/blog/2026/ai-khong-the-nen-mot-cuc-kien-thuc-khong-lo-vao-nao-ban/" rel="alternate" type="text/html" title="AI cannot compress a huge chunk of knowledge into your brain"/><published>2026-06-12T06:30:00+00:00</published><updated>2026-06-12T06:30:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/ai-khong-the-nen-mot-cuc-kien-thuc-khong-lo-vao-nao-ban</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/ai-khong-the-nen-mot-cuc-kien-thuc-khong-lo-vao-nao-ban/"><![CDATA[<p>Many people now take it for granted that using AI can turn absorbing a big topic over many days into one day, or even a few minutes. But there is a big misunderstanding here: AI can summarize, orient, filter out noise, and point the way; but AI cannot “compress” an enormous body of knowledge and stuff it straight into our brains without any loss.</p> <p>If the goal is just an overview, AI is very useful. You want to know what Sherlock Holmes is? AI can tell you: it is a series of stories about Holmes and Watson solving cases through deduction. 
But if you want to really understand the stories, the setting, the characters’ personalities, the tense situations, how Holmes handles each case, the case of “The Speckled Band”, the Agra treasure, Watson losing the whole treasure chest but gaining another treasure in Mary, or the feeling as Holmes peels back each layer of the truth before the reader’s eyes, then a few lines of summary are not enough. In the end, you still have to read. Not because AI is useless, but because experience and detail cannot be transmitted intact through an ultra-short compressed passage.</p> <p>Software is the same. You want an overview of the Linux kernel? AI can summarize it very quickly. But if you really want to understand <code class="language-plaintext highlighter-rouge">io_uring</code>, in the end you still have to read the documentation, read the source code, and understand why it exists, what problem it solves, and how it was implemented. AI can help you find the right place faster, explain concepts more simply, and reduce the effort of filtering noise. But it does not replace the need to understand it yourself.</p> <p>The key point is: before you ever get to <code class="language-plaintext highlighter-rouge">io_uring</code>, you need a chain of questions in your head. What does the CPU actually do? Why does blocking I/O waste resources? What is a thread? Why not just create lots of threads? How expensive is a context switch? How does Linux handle I/O? 
Only from there do you arrive at <code class="language-plaintext highlighter-rouge">poll</code>, <code class="language-plaintext highlighter-rouge">epoll</code>, and then <code class="language-plaintext highlighter-rouge">io_uring</code>.</p> <div class="row"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/discovery-nature-1-480.webp 480w,/assets/img/discovery-nature-1-800.webp 800w,/assets/img/discovery-nature-1-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/discovery-nature-1.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" title="quick lookup" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> The doors you need to pass through to reach `io_uring` </div> <p>AI can answer very well once you already know what to ask. But if you do not yet have the chain of the problem in your head, and do not know which “doors” exist, AI usually just hands you a generic summary. For someone who already knows the topic, that summary is too basic. For someone who knows nothing, it is like a smooth wall with unnamed doors painted the same color as the wall, and they do not even know where to knock.</p> <p>The origin of learning, research, and discovery has not changed. People still have to be curious, bump into things, read, try, fail, run into new keywords, ask new questions, and keep digging deeper. Sometimes we learn about a topic not because we set out to find it, but because we stumbled on it in a blog post (meaning: go read my blog, folks), a piece of code, a course, a discussion, or while looking into something else.</p> <p>AI changes the speed of access, not the nature of understanding. We used to Google each keyword, open tab after tab, read, and sift. Now AI helps gather ideas, explain, suggest directions, and cut down the noise. 
But the core part still belongs to humans: forming the questions yourself, recognizing the problem yourself, going through the details yourself, and building the mental model yourself.</p> <p>AI does not replace our brains. It is more like a good guide than a device that transmits knowledge directly into the brain. It can help us go faster, but it cannot make the journey for us.</p> <p>I see a lot of folks deifying AI these days, as if it were some magic thing that can turn everything into instant noodles. What I’m describing will probably only be possible once we have Doraemon’s memory bread, or once Elon Musk finishes a Neuralink that syncs information straight into the brain. Use AI as support, but you still have to use your own brain, folks. A day without thinking is really uncomfortable!</p>]]></content><author><name></name></author><category term="writing"/><category term="discovery"/><category term="learning"/><category term="ai"/><category term="software"/><category term="linux"/><summary type="html"><![CDATA[AI does not replace our brains. It is more like a good guide than a device that transmits knowledge directly into the brain. It can help us go faster, but it cannot make the journey for us.]]></summary></entry><entry><title type="html">Making a Fragile System</title><link href="https://www.dinhphu28.com/blog/2026/making-a-fragile-system/" rel="alternate" type="text/html" title="Making a Fragile System"/><published>2026-06-01T08:22:00+00:00</published><updated>2026-06-01T08:22:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/making-a-fragile-system</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/making-a-fragile-system/"><![CDATA[<h2 id="i-will-support-you-but-i-dont-want-to-build-the-feature">I will support you, but I don’t want to build the feature</h2> <p>Some folks develop software and are then willing to personally support users of the features they built. 
Jack (an imaginary software engineer) adds a new feature whose data he expects to be read often and almost never written, so he writes only the API that reads from the database, not one that writes to it. He is willing to write to the database directly every time a user needs to update the data.</p> <p>When the QA team tests the feature, he is willing to write to the database for them. He does the same for users. He builds many features like this over time, thinking that it takes less time to build them that way.</p> <p>A few months later, the number of people asking him for support keeps growing. He doesn’t have time to do anything except support. So now Jack is a support guy, not a software engineer anymore. The system does not run by itself; Jack is the system LoL.</p> <p>Jack has to work at night, on weekends, and even on holidays to support users. He is tired and stressed out, but he has to do it because he promised to support users. He can’t even take a break or go on vacation because of the support work.</p> <p>The paradox is that Jack is considered a hero by users, his boss, and his colleagues because of his dedication to supporting users. He is praised for his hard work and commitment.</p> <p>The sunset of the system is near, some day in the future. It does not work by itself, no one knows what’s going on, it does not scale, it is not maintainable, no one wants to use it, and no one wants to support it. It’s now a fragile system.</p> <h2 id="the-reason-why-we-make-software">The reason why we make software</h2> <p>We know that human cost is the most expensive cost in software development. Jack has violated a basic principle of not only software engineering but the whole industry: people make machines to reduce human work, not the opposite.</p> <p>But the problem is not only Jack. Why does his boss allow him to do that? Why don’t his colleagues stop him? Why don’t the users stop him? 
Why doesn’t the QA team stop him?</p> <p>I think many of us have been Jack at some point in our careers, especially when we were new to the industry.</p> <h2 id="the-iceberg-of-software-development">The iceberg of software development</h2> <p>Some folks throw exceptions but never handle them; they just catch them at the end by default and let the system show “Something went wrong” to users.</p> <p>One day users message us and say a feature didn’t work. We open the logs and see many exceptions, but we don’t know which one belongs to this case. And even if we know which one, we don’t know what happened, because all it says is “Something went wrong”: useless logs LoL.</p> <p>Jack thinks the task is done as soon as it works. That “done” is just the tip of the iceberg.</p> <p>Error handling, logging, monitoring, alerting, documentation, testing, scalability, maintainability, security, and many other things are the body of the iceberg that we can’t see but have to deal with. Skipping them is technical debt that we have to pay later.</p> <p>Stakeholders don’t see the iceberg; they only see the tip, so they think the task is done as soon as it works. They don’t care about the technical debt, they just want to see the result.</p> <h2 id="what-should-we-do">What should we do?</h2> <p>The responsibility of a professional software engineer is to argue with stakeholders until they understand the iceberg. Our job is not to “make it work” but to “make it work well”. Think from the point of view of users: not only the end users, but also the people who support the system. 
And especially, think for ourselves: we might be Jack in the future.</p> <h2 id="symptoms-of-a-fragile-system">Symptoms of a fragile system</h2> <ul> <li>Tight Coupling: The components are highly dependent on each other, making them hard to change or replace without affecting each other.</li> <li>Rigid Architecture: The system is hard to adapt to new requirements without a huge refactor or rewrite.</li> <li>Lack of Testing: Some folks write code without writing tests, or write tests just to satisfy the CI/CD pipeline, which lets hidden bugs through.</li> <li>A Lot of Technical Debt: It’s the story of Jack.</li> <li>Poor or No Separation of Concerns: The code is a mess of business logic, access control, validation, and other concerns, making it hard to understand and maintain.</li> </ul> <h2 id="key-takeaways">Key takeaways</h2> <p>OK, we can follow the SOLID principles, use design patterns, and write clean code, but if we don’t have the right mindset, we will still end up with a fragile system.</p> <p>Always think of ourselves as the users, and as the other developers who will maintain the system in the future.</p> <p>I have a quote for you: “Don’t let current work take your time in the future”.</p> <p>If what you’ve done is for end users, put yourself in their shoes. If your code is used by other developers, even yourself, they are also users.</p> <p>Don’t do anything with the mindset of “I will support them”, “I will explain it to them” or “Do it later”. Remember that “Later” is never! 
Professional software engineers do it right and are aware of the iceberg, not just “complete the task”.</p> <p>Yeah, of course, no one wants other developers to read our code and say “what the f**k is this mess”, at least.</p>]]></content><author><name></name></author><category term="software"/><category term="software-engineering"/><category term="clean-code"/><category term="discipline"/><category term="professionalism"/><summary type="html"><![CDATA[Some folks just want to make it work, but they don't care about the iceberg of software development, which leads to a fragile system. Let me tell you the story of Jack, who is a software engineer but becomes a support guy because of his mindset.]]></summary></entry><entry><title type="html">Why I use the terminal for most of my software work</title><link href="https://www.dinhphu28.com/blog/2026/tai-sao-minh-xai-terminal/" rel="alternate" type="text/html" title="Why I use the terminal for most of my software work"/><published>2026-05-13T07:19:00+00:00</published><updated>2026-05-13T07:19:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/tai-sao-minh-xai-terminal</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/tai-sao-minh-xai-terminal/"><![CDATA[<h2 id="vấn-đề">The problem</h2> <p>Have you ever had the app you use roll out a new UI, and the button you always used is no longer there but has drifted off somewhere?</p> <p>Or you spend half a day figuring out how to set up IntelliJ IDEA so that it picks up your env and the <strong>run</strong> button does what you want, then you have to take screenshots and notes so you don’t have to figure it out from scratch later. And by then it has shipped a new UI. Just like going from Windows 7 to 10 and 11.</p> <p>Not to mention that every framework, language, or tool you use is configured in its own way. Why suffer the headache and waste time on that? Spend the time on more productive things. 
So from that day on I switched to working with TUIs more.</p> <h2 id="viết-note-cho-những-lần-sau">Writing notes for next time</h2> <p>With a CLI, everything you use is a command, so naturally it can be saved as text for later reference, instead of taking a screenshot and adding a pile of description to the image.</p> <p>If there are many steps, you can even write them into a script file and just run that script, with no long sequence of clicks. For example, I have a ready-made set of scripts for Linux; whenever I set up a new machine I run them once and it’s done.</p> <h2 id="coding">Coding</h2> <p>I work in Java, and every time I open IntelliJ it bites off a big chunk of my RAM, not to mention how laggy it is right after startup. And when I switch from Java to TypeScript? Open VS Code. Well, using VS Code for everything works too =)))</p> <p>But I find that taking my hands off the keyboard every now and then to grab the mouse, move here and there, and click around wastes quite a bit of time, so I use Vim. Of course, VS Code and IntelliJ also have a Vim mode.</p> <p>I use tmux + NeoVim, so I almost never touch the mouse unless I need to leave the terminal.</p> <h2 id="setup-công-việc-của-mình">My work setup</h2> <p>With tmux, I can switch between windows and panes. 
Each window or pane corresponds to an app or a group of tasks.</p> <p>Instead of Postman or Insomnia, I use <strong>curl</strong>; to pretty-print JSON there is <strong>jq</strong>, which also lets you query JSON, so it is quite handy.</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-X</span> GET <span class="s1">'https://example.com/articles'</span> <span class="se">\</span>
	<span class="nt">-H</span> <span class="s1">'x-api-key: apikey123'</span> | jq
</code></pre></div></div> <p>If you use Linux or Mac, basic commands such as navigating the file system are nothing new.</p> <p>To jump to a directory quickly, use <strong>fzf</strong>; to grep files there is <strong>ripgrep</strong>.</p> <p>Actually I don’t need to remember all the complicated commands I use often; I can press <strong>^R</strong> with <strong>fzf</strong> to quickly search the commands in my history instead of retyping them.</p> <p>On top of that, combined with <strong>oh-my-zsh auto-complete</strong>, typing the first few characters brings up the full command.</p> <p>I have a server at home, so working mainly in the terminal is convenient; whenever I need something I just SSH in. Essentially everything I do lives on the server, so if I’m in the middle of something I just close the laptop, and next time I SSH in and reattach the tmux session to continue exactly where I left off.</p> <p>That means all you need is a device that can SSH; you can even code on a phone or an iPad.</p> <p>Sometimes when I travel and want to pack light while still being able to handle work, an iPad is enough for me. A phone works too, of course, but it is too small and I don’t want new eyes just yet =)))</p> <h2 id="kết">Wrapping up</h2> <p>I choose the terminal not really because I reject GUI apps; it simply suits my personal way of working. For querying SQL databases I still use DataGrip; the mysql client or psql CLI are only for emergencies.</p> <p>For some people, using the terminal looks cool, and that is a reason to choose it too.</p> <p>I wrote this post to share my personal experience. 
If you’re interested or have an interesting work setup of your own, feel free to leave a comment below.</p>]]></content><author><name></name></author><category term="utilities"/><category term="terminal"/><category term="vim"/><category term="productivity"/><category term="linux"/><category term="mac"/><summary type="html"><![CDATA[People in software in particular, and IT in general, all have their go-to tools: some use IntelliJ, some VS Code, plus a dozen other apps like Postman and Git Kraken. Each app works differently, and one GUI update later you can't find the thing you were using; as for me, I chose to work in the terminal.]]></summary></entry><entry><title type="html">Optimistic and Pessimistic Locking</title><link href="https://www.dinhphu28.com/blog/2026/optimistic-and-pessimistic-locking/" rel="alternate" type="text/html" title="Optimistic and Pessimistic Locking"/><published>2026-05-12T10:30:00+00:00</published><updated>2026-05-12T10:30:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/optimistic-and-pessimistic-locking</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/optimistic-and-pessimistic-locking/"><![CDATA[<h2 id="problem">Problem</h2> <p>In software development, we can’t avoid the resource contention problem.</p> <p>Tony and Jack are two users of a blogging platform. One day, dinhphu28 tells Tony to change the ‘untitled’ post to ‘Optimistic Locking in Java’ because of a mistake made when creating the post. Everything must be done before 7 PM because the post will be published at that time.</p> <p>Tony starts editing the title at 5:00 PM and saves it at 5:01 PM. To be sure, he reloads the page and sees that the title has changed as he wanted. But Jack, the editor of the post, also sees the title ‘untitled’ and starts editing it at the same time. 
Jack changes the title to ‘Pessimistic Locking in Java’ but only saves it at 5:30 PM, because he is busy bathing his cat and forgets to save.</p> <p>When Tony clicks the save button, he believes that his change is correct and will be saved. Then he gets ready for a date with his girlfriend.</p> <p>At 7 PM, during the date, Tony receives a message from dinhphu28 asking why the title is ‘Pessimistic Locking in Java’ instead of ‘Optimistic Locking in Java’ as he requested.</p> <p>“What the f**k is going on? I saved it at 5:01 PM. Why has the title changed to ‘Pessimistic Locking in Java’?” Tony asks.</p> <pre><code class="language-mermaid">---
title: Problem
---

sequenceDiagram
    participant T as Tony
    participant B as Blog Platform
    participant J as Jack

    T -&gt;&gt; B: Update title
    Note right of T: Optimistic Locking in Java

    J -&gt;&gt; B: Update title
    Note left of J: Pessimistic Locking in Java

    B --&gt;&gt; T: Result
    Note left of T: Optimistic Locking in Java

    B --&gt;&gt; J: Result
    Note right of J: Pessimistic Locking in Java

    T -&gt;&gt; B: Get title
    B --&gt;&gt; T: Response
    Note left of T: Pessimistic Locking in Java&lt;br/&gt;(Not as expected)
</code></pre> <h2 id="how-to-solve-the-problem">How to solve the problem?</h2> <p>The idea is to use a locking mechanism to prevent the resource contention problem. In this case, we have two approaches:</p> <ul> <li> <p>Optimistic Locking: Tony and Jack can edit the post at the same time, but when either of them saves, the system checks whether the post has been changed by another user since it was loaded. If it has, the system rejects the change and asks the user to reload the page and edit again.</p> </li> <li> <p>Pessimistic Locking: When Tony starts editing the post, the system locks the post and prevents Jack from editing it until Tony saves or cancels the edit.</p> </li> </ul> <pre><code class="language-mermaid">---
title: Optimistic Locking
---

sequenceDiagram
    participant T as Tony
    participant B as Blog Platform
    participant J as Jack

    T -&gt;&gt; B: Get article
    B --&gt;&gt; T: Response: 'untitled' (version 1)
    J -&gt;&gt; B: Get article
    B --&gt;&gt; J: Response: 'untitled' (version 1)

    T -&gt;&gt; B: Update title
    Note right of T: Optimistic Locking in Java&lt;br/&gt;(version 2)
    J -&gt;&gt; B: Update title
    Note left of J: Pessimistic Locking in Java&lt;br/&gt;(version 2)

    B --&gt;&gt; T: Result
    Note left of T: Optimistic Locking in Java&lt;br/&gt;(version 2)
    B --&gt;&gt; J: Response error
    Note over B,J: Version conflict on update

    T -&gt;&gt; B: Get title
    B --&gt;&gt; T: Response
    Note left of T: Optimistic Locking in Java&lt;br/&gt;(As expected)
</code></pre> <pre><code class="language-mermaid">---
title: Pessimistic Locking
---

sequenceDiagram
    participant T as Tony
    participant B as Blog Platform
    participant J as Jack

    T -&gt;&gt; B: Update title
    Note right of T: Optimistic Locking in Java&lt;br/&gt;(Acquired lock on the article)

    J -&gt;&gt; B: Update title
    Note left of J: Pessimistic Locking in Java
    Note over J, B: Cannot update because of locking

    B --&gt;&gt; J: Response error

    B --&gt;&gt; T: Result
    Note left of T: Optimistic Locking in Java

    T -&gt;&gt; B: Get title
    B --&gt;&gt; T: Response
    Note left of T: Optimistic Locking in Java&lt;br/&gt;(As expected)
</code></pre> <h2 id="pros-and-cons">Pros and Cons</h2> <h3 id="optimistic-locking">Optimistic Locking</h3> <ul> <li>Pros: <ul> <li>Better performance because it allows concurrent editing.</li> <li>Suitable for scenarios where conflicts are rare.</li> </ul> </li> <li>Cons: <ul> <li>Conflicts are detected only at save time, so when multiple users edit the same resource simultaneously, all but the first save are rejected.</li> <li>Users may have to redo their work if their changes are rejected.</li> </ul> </li> </ul> <h3 id="pessimistic-locking">Pessimistic Locking</h3> <ul> <li>Pros: <ul> <li>Prevents data loss by ensuring that only one user can edit the resource at a time.</li> <li>Suitable for scenarios where conflicts are common.</li> </ul> </li> <li>Cons: <ul> <li>Can lead to performance issues due to locking, especially if users hold locks for a long time.</li> <li>Can cause deadlocks if not implemented carefully.</li> </ul> </li> </ul> <h2 id="conclusion">Conclusion</h2> <p>Of course, the update process takes just a fraction of a second, not as long as in the example above. But the problem is the same when multiple users edit the same resource at the same time. Choosing the right locking mechanism based on the specific use case and requirements of your application is crucial to ensure data integrity and performance.</p>]]></content><author><name></name></author><category term="software"/><category term="java"/><category term="concurrency"/><category term="system-design"/><category term="race-condition"/><summary type="html"><![CDATA[Optimistic and Pessimistic Locking are two common approaches to handle resource contention in software development. 
This article explains the concepts, pros and cons of each approach, and how to choose the right one for your application.]]></summary></entry><entry><title type="html">Transactional and Performance Problem</title><link href="https://www.dinhphu28.com/blog/2026/transactional-and-performance-problem/" rel="alternate" type="text/html" title="Transactional and Performance Problem"/><published>2026-05-06T10:30:00+00:00</published><updated>2026-05-06T10:30:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/transactional-and-performance-problem</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/transactional-and-performance-problem/"><![CDATA[<p>Sometimes a function must be atomic but involves more than just database work.</p> <p>E.g. saving a new blog article in 2 steps:</p> <ul> <li>Insert into the database</li> <li>Send a notification</li> </ul> <p>What happens when sending the notification fails? We have a record saved in the database without a notification.</p> <p>The client code receives an error, but the database insert has still taken effect.</p> <p>The solution is to treat both steps as one transaction, like this:</p> <pre><code class="language-mermaid">---
title: Save new blog article
---

graph TD
    startTrans["Start database transaction"]
    ins["Insert query"]
    insSuccess{"Insert success?"}
    sendNoti["Send notification"]
    sendNotiSuccess{"Send success?"}
    commitTrans["Commit transaction"]
    rollback["Rollback"]
    E["End"]

    startTrans --&gt; ins
    ins --&gt; insSuccess
    insSuccess --&gt;|true| sendNoti
    insSuccess --&gt;|false| rollback
    sendNoti --&gt; sendNotiSuccess

    sendNotiSuccess --&gt;|true| commitTrans
    commitTrans --&gt; E

    sendNotiSuccess --&gt;|false| rollback
    rollback --&gt; E
</code></pre> <p>This preserves data integrity, but with this approach the database connection must be held for the whole process. If <strong>Send notification</strong> takes a long time, the connection stays occupied until sending completes. Many such processes running at once can consume almost all database connections and have a huge impact on performance and scalability.</p> <p>This is just an example of the transactional problem. In reality, we can accept a notification failure as long as the database insert succeeds, because in the blog use case the notification is not important enough to be this strict.</p> <p>In a monolithic system where traffic is not high and most critical features work on the same database, this is the easiest way to solve the problem.</p> <p>For microservices or high-scale systems, we must break the rule that everything has to be in one transaction, and we can use the Outbox pattern or Saga instead.</p> <p>No approach is the best solution; choose the most suitable one for your use case and accept the trade-off.</p>]]></content><author><name></name></author><category term="software"/><category term="software-engineering"/><category term="transactional"/><category term="system-design"/><category term="performance"/><summary type="html"><![CDATA[In software engineering, sometimes we encounter a situation where we need to perform multiple operations that must either all succeed or all fail together. This is known as a transactional problem. 
How can we deal with it and what's the trade-off between different approaches?]]></summary></entry><entry><title type="html">Software Quality by Nature: The relationship between Software Engineering, Mathematics and Science</title><link href="https://www.dinhphu28.com/blog/2026/software-quality-by-nature-the-relationship-between-software-engineering-mathematics-and-science/" rel="alternate" type="text/html" title="Software Quality by Nature: The relationship between Software Engineering, Mathematics and Science"/><published>2026-05-05T06:42:00+00:00</published><updated>2026-05-05T06:42:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/software-quality-by-nature-the-relationship-between-software-engineering-mathematics-and-science</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/software-quality-by-nature-the-relationship-between-software-engineering-mathematics-and-science/"><![CDATA[<p>We write software and test it day by day, but have you ever thought about why we need to test software? How much testing is enough?</p> <p>Is testing the only way to verify the correctness of software?</p> <p>Some people don’t know why their product is so buggy. Is the reason that they don’t care about the quality of their software? Or that they don’t know how to improve it?</p> <p>Have you ever put yourself in the position of building highly critical software where every failure can cause a huge loss, such as software for aerospace, nuclear power plants, or medical devices?</p> <p>Right, that is the cost we have to pay for the quality of software! This reminds me of Conway’s Law lol.</p> <h2 id="the-relationship-between-software-engineering-mathematics-and-science">The relationship between Software Engineering, Mathematics and Science</h2> <p>We all know that to prove a function in software is correct, we must decompose it into smaller pieces and prove each piece is correct, recursively. 
Once all of them are provable units (primitives), we can ensure the original function is correct.</p> <p><strong>Do you realize that this is exactly the same as a mathematical proof?</strong> We decompose a theorem into smaller lemmas and prove each lemma is correct, recursively. Once all of them rest on axioms, which are accepted without proof, we can prove the original theorem is correct. And this is the key strategy of Formal Methods.</p> <p><strong>But why are Formal Methods not widely used in software development nowadays?</strong> Because of the complexity of software, it costs a great deal to prove each small piece. When the cost to the business matters more than the cost of quality, we have to make a trade-off between them.</p> <h3 id="science">Science</h3> <p>How can we know a scientific theory is correct? The answer is: we can’t know for sure, but we can test it with experiments.</p> <p>Why do I say that? Everybody knows that an apple always falls down, but how can we be sure of that?</p> <p>You can say that you have seen it many times, but how many times is enough? You cannot be sure that one day in the future an apple won’t fall up, or float in the air.</p> <p>Because of gravity? But how do you know that gravitational acceleration always points down? Do you know how gravity works in space? Even today, the essence of gravity is still a mystery: we know how it behaves, but not why.</p> <p>Back to the day the apple fell on Newton’s head. He and everyone else knew that apples always fall down because they had seen it millions of times. But they didn’t know why.</p> <p>People considered Newton’s theory of gravity correct until Einstein’s theory of relativity came out.</p> <p>Do you realize that, in science, theories and laws cannot be proven correct? 
They can only be proven wrong, when we find a counter-example that disproves them.</p> <p>E.g.</p> \[F_n = 2^{2^n} + 1\] <p>Fermat conjectured that for every non-negative integer \(n\), \(F_n\) is a prime number.</p> <p>But one day, Euler showed that \(F_5=2^{2^5} + 1 = 4\,294\,967\,297 = 641 \times 6\,700\,417\), which is composite.</p> <p>Just one case is enough to prove that Fermat’s conjecture about <strong>Fermat numbers</strong> is wrong.</p> <h3 id="software-verification-in-modern-days">Software Verification in modern days</h3> <p>As I said, because mathematical proof is so hard, nowadays we usually use the scientific method to verify software. That means we assume the software is correct and try to find any failing case that disproves it. People call this process <strong>testing</strong>.</p> <p>Obviously, we accept that we cannot be sure that the software is completely correct. But in practice, we can be confident that the software is good enough for our use if we cannot find any failing case after testing it with a lot of cases.</p> <p>So how can we best show that the software is good enough for our use? Just test the highest level of the software? Absolutely not, because if we only test the highest level, we cannot be sure that the lower levels are correct. We might conclude that the higher level is correct based on something that is wrong. That is why we need to test every level of the software, from the lowest to the highest.</p> <h2 id="is-the-formal-methods-dead">Are Formal Methods dead?</h2> <p>No. In some critical systems, such as aerospace, the cost of failure is so high that we have to use Formal Methods to ensure the correctness of software. IBM has also used Formal Methods for its Customer Information Control System (CICS).</p> <h2 id="conclusion">Conclusion</h2> <p>The method we use to verify software nowadays depends on the cost of failure and the cost of verification. 
Even when we only test, the quality of the software reflects how much weight we give to the cost of failure. For software where release speed and fancy features matter more than quality, people usually accept lower quality and live with the risk of failure.</p> <p>As a software engineer, my opinion is that we should always try to improve quality, even when the cost of failure is not high. This is professionalism.</p> <p>Especially in the era of AI, software is becoming more and more powerful, and the cost of failure is increasing too. If we can’t control what the AI produces, the cost of failure can be huge. Certainly, nobody wants their product to blow up and become a mess, right?</p>]]></content><author><name></name></author><category term="software"/><category term="quality-assurance"/><category term="testing"/><category term="formal-methods"/><category term="software-engineering"/><category term="ai"/><summary type="html"><![CDATA[People make software, write code and test it day by day, but is there anyone really thinking about the essence of software quality? In this article, I will talk about the relationship between Software Engineering, Mathematics and Science, and how we can use them to verify the software.]]></summary></entry><entry><title type="html">Adapter Pattern and Applications</title><link href="https://www.dinhphu28.com/blog/2026/adapter-design-pattern/" rel="alternate" type="text/html" title="Adapter Pattern and Applications"/><published>2026-04-21T03:56:00+00:00</published><updated>2026-04-21T03:56:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/adapter-design-pattern</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/adapter-design-pattern/"><![CDATA[<h2 id="problem">Problem</h2> <p>In day-to-day development, we often work with many incompatible interfaces, such as 3rd-party APIs, libraries, and legacy code.</p> <p>Some folks write code that tangles business logic with 3rd-party APIs. 
E.g. they use the same model for both the business logic and the API payload. As a result, they need the APIs to be completed and ready to use before they can make progress.</p> <p>But what happens if:</p> <ul> <li>Our code and these APIs are written simultaneously</li> <li>One of them needs to change in the future</li> </ul> <p>Right! In the first case, we must wait for the APIs to be completed, and the overall task duration stretches out.</p> <p>Instead of doing nothing and then rushing when the deadline arrives, just do the following:</p> <p>When we code our business logic, we already know what we need, such as inputs and outputs. So we can write interfaces, write all the business logic that depends on them, test it independently, and write a mock implementation if needed. We don’t need to wait until the APIs are completed.</p> <p>Then, when the APIs are in our hands, just do:</p> <ul> <li><strong>Learning Test</strong>: To learn and verify that every feature we need works properly.</li> <li><strong>Write Connector</strong>: Write the connector, which matches the APIs exactly and nothing else. (Remember to delete the mock.)</li> <li><strong>Write Adapter</strong>: Now connect the interface to the connector with just the adapter.</li> </ul> <p>The second case is change. If we let the coupling happen, we violate the Open/Closed Principle. That means when anything changes, we must change all of the related code. The impact is huge, and the change takes a lot of time.</p> <p>With the Adapter Pattern, we only need to change one side and keep the adapter compatible.</p> <h2 id="what-is-adapter-pattern">What is Adapter Pattern</h2> <p>I think with the problem above and the name, you already know what the Adapter Pattern is.</p> <p>Think of it like a plug adapter. In Vietnam, I usually use a Type-A charger for my laptop. But when I travel to countries that use other types, such as Type-I, I don’t want to buy another charger, use it for a few days, and then throw it away. 
I just need to buy an I-A plug adapter, or even a universal plug adapter that works anywhere, for just a few dollars.</p> <pre><code class="language-mermaid">---
title: Adapter Pattern
config:
    theme: redux-color
    look: handDrawn
    markdownAutoWrap: false
    class:
        hideEmptyMembersBox: true
---

classDiagram

    ClientInterface &lt;-- Client
    ClientInterface &lt;|-- Adapter
    note for Adapter "serviceData = convertToServiceData(data)&lt;br/&gt;return serviceMethod(serviceData)"
    Adapter *-- ThirdParty

    class ClientInterface {
    	&lt;&lt;interface&gt;&gt;
        +method(data)
    }
    class Adapter {
        -ThirdParty adaptee
        +method(data)
    }
    class ThirdParty {
        ...
        serviceMethod(serviceData)
    }
</code></pre> <h3 id="example">Example</h3> <pre><code class="language-mermaid">---
title: Blog Adapter
config:
    theme: redux-color
    look: handDrawn
    markdownAutoWrap: false
    class:
        hideEmptyMembersBox: true
---

classDiagram

    Blog &lt;-- Client
    note for Client "article = blog.getArticle(id)"
    Blog &lt;|-- BlogMediumAdapter
    note for BlogMediumAdapter "mediumPost = getPost(id)&lt;br/&gt;article = convertToArticle(mediumPost)&lt;br/&gt;return article"
    BlogMediumAdapter *-- Medium

	class Client {
		-Blog blog
		...
	}
    class Blog {
    	&lt;&lt;interface&gt;&gt;
        +getArticle(id) Article
    }
    class BlogMediumAdapter {
        -Medium adaptee
        +getArticle(id) Article
    }
    class Medium {
        ...
        +getPost(id) MediumPost
    }
</code></pre> <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">Blog</span> <span class="o">{</span>
    <span class="nc">Article</span> <span class="nf">getArticle</span><span class="o">(</span><span class="nc">String</span> <span class="n">id</span><span class="o">);</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Medium</span> <span class="o">{</span>
    <span class="kd">public</span> <span class="nc">MediumPost</span> <span class="nf">getPost</span><span class="o">(</span><span class="nc">String</span> <span class="n">id</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">MediumPost</span> <span class="n">post</span> <span class="o">=</span> <span class="n">getPostFromMediumApi</span><span class="o">(</span><span class="n">id</span><span class="o">);</span>
        <span class="k">return</span> <span class="n">post</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">private</span> <span class="nc">MediumPost</span> <span class="nf">getPostFromMediumApi</span><span class="o">(</span><span class="nc">String</span> <span class="n">id</span><span class="o">)</span> <span class="o">{</span>
        <span class="c1">// ...</span>
    <span class="o">}</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">BlogMediumAdapter</span> <span class="kd">implements</span> <span class="nc">Blog</span> <span class="o">{</span>
    <span class="kd">private</span> <span class="nc">Medium</span> <span class="n">medium</span><span class="o">;</span>

    <span class="kd">public</span> <span class="nf">BlogMediumAdapter</span><span class="o">(</span><span class="nc">Medium</span> <span class="n">medium</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">medium</span> <span class="o">=</span> <span class="n">medium</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="nc">Article</span> <span class="nf">getArticle</span><span class="o">(</span><span class="nc">String</span> <span class="n">id</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">MediumPost</span> <span class="n">mediumPost</span> <span class="o">=</span> <span class="n">medium</span><span class="o">.</span><span class="na">getPost</span><span class="o">(</span><span class="n">id</span><span class="o">);</span>
        <span class="nc">Article</span> <span class="n">article</span> <span class="o">=</span> <span class="n">convertToArticle</span><span class="o">(</span><span class="n">mediumPost</span><span class="o">);</span>
        <span class="k">return</span> <span class="n">article</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">private</span> <span class="nc">Article</span> <span class="nf">convertToArticle</span><span class="o">(</span><span class="nc">MediumPost</span> <span class="n">post</span><span class="o">)</span> <span class="o">{</span>
        <span class="c1">// ...</span>
    <span class="o">}</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Client</span> <span class="o">{</span>
    <span class="kd">private</span> <span class="nc">Blog</span> <span class="n">blog</span><span class="o">;</span>
    <span class="kd">public</span> <span class="nf">Client</span><span class="o">(</span><span class="nc">Blog</span> <span class="n">blog</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">blog</span> <span class="o">=</span> <span class="n">blog</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">exampleClientMethod</span><span class="o">()</span> <span class="o">{</span>
        <span class="nc">String</span> <span class="n">exampleId</span> <span class="o">=</span> <span class="s">"abc1234"</span><span class="o">;</span>
        <span class="nc">Article</span> <span class="n">article</span> <span class="o">=</span> <span class="n">blog</span><span class="o">.</span><span class="na">getArticle</span><span class="o">(</span><span class="n">exampleId</span><span class="o">);</span>
        <span class="c1">// Do anything you want with article, e.g. print article's properties</span>
    <span class="o">}</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Main</span> <span class="o">{</span>
    <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="nc">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">Medium</span> <span class="n">medium</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Medium</span><span class="o">();</span>
        <span class="nc">Blog</span> <span class="n">blog</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">BlogMediumAdapter</span><span class="o">(</span><span class="n">medium</span><span class="o">);</span>
        <span class="nc">Client</span> <span class="n">client</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Client</span><span class="o">(</span><span class="n">blog</span><span class="o">);</span>

        <span class="n">client</span><span class="o">.</span><span class="na">exampleClientMethod</span><span class="o">();</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div> <h2 id="pros-and-cons">Pros and Cons</h2> <p>Pros:</p> <ul> <li>It separates the business logic from the external interfaces, which satisfies the Single Responsibility Principle.</li> <li>We only need to write a new adapter instead of changing the existing interfaces or business logic, which satisfies the Open/Closed Principle.</li> </ul> <p>Cons: Because we add extra interfaces and adapters, the code becomes more complex. Sometimes a small change in the business layer is all we actually need.</p>]]></content><author><name></name></author><category term="software"/><category term="design-pattern"/><summary type="html"><![CDATA[The Adapter Pattern is a structural design pattern that allows incompatible interfaces to work together by acting as a bridge or translator.]]></summary></entry><entry><title type="html">Java Virtual Threads Explained: How They Work and When to Use Them</title><link href="https://www.dinhphu28.com/blog/2026/java-virtual-threads-explained-how-they-work-and-when-to-use-them/" rel="alternate" type="text/html" title="Java Virtual Threads Explained: How They Work and When to Use Them"/><published>2026-04-17T09:19:00+00:00</published><updated>2026-04-17T09:19:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/java-virtual-threads-explained-how-they-work-and-when-to-use-them</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/java-virtual-threads-explained-how-they-work-and-when-to-use-them/"><![CDATA[<blockquote> <p>This post continues from <a href="/blog/2026/why-blocking-io-hurts-and-how-asynchronous-fixes-it/">Part 1: Why Blocking I/O Hurts and How Asynchronous Fixes It</a></p> </blockquote> <h2 id="introduction">Introduction</h2> <p>Officially released in Java 21, <strong>Virtual Threads</strong> are lightweight, software-based threads managed by the Java Virtual Machine (JVM) instead of the OS.</p> <p>They allow us to write code in a “linear” synchronous style while getting the high-scale performance of complex asynchronous 
code.</p> <h2 id="how-virtual-thread-works">How Virtual Threads work</h2> <p>Instead of being scheduled onto the CPU directly by the OS, Virtual Threads run on top of a small pool of standard OS threads called Carrier Threads, usually one per CPU core.</p> <h3 id="mounting-and-unmounting">Mounting and Unmounting</h3> <ul> <li><strong>Mounting</strong>: When a Virtual Thread has work to do, the JVM mounts it onto a Carrier Thread, which then executes its tasks on the CPU.</li> <li><strong>Unmounting</strong>: When a task hits a blocking operation (I/O), the JVM unmounts the Virtual Thread from the Carrier Thread. The Virtual Thread’s state (stack and variables) is captured and moved from the Carrier Thread to the Java heap (RAM). The Carrier Thread is now free to run other Virtual Threads.</li> </ul> <p>Once the I/O operation finishes, the JVM puts the Virtual Thread back into the scheduler’s queue. As soon as a Carrier Thread becomes available, the JVM restores the state from RAM and the Virtual Thread continues exactly where it left off.</p> <h3 id="why-its-better-os-thread">Why it’s better than an OS thread</h3> <p>Because its state is just a small object in memory (on the order of KB), a Virtual Thread is much lighter than an OS thread. And mounting/unmounting does not make the CPU perform an OS-level “context switch”, so we avoid the cost I mentioned in the previous article.</p> <p>So what do I mean when I say the CPU doesn’t perform “context switching”? The Carrier Thread’s job is to mount and unmount Virtual Threads. We can think of the Carrier Thread as playing the role of the physical CPU core. “Context switching” becomes mounting/unmounting, at a very low memory cost.</p> <p>The cost of CPU “context switching” is now close to zero, because the OS threads almost always have work to do.</p> <p>The question is: no matter whether it’s an OS thread or a Virtual Thread, the blocking I/O is still there. 
Because Virtual Threads are just higher-level tasks running on OS threads like any other task, if one is blocked by I/O, the OS thread is still blocked, especially when we still use <code class="language-plaintext highlighter-rouge">java.io</code>, right?</p> <p>Don’t worry. The JDK’s standard I/O code has been reworked to check on every blocking I/O call (an “atomic” task): if the caller is a Virtual Thread, unmount it.</p> <h2 id="the-pinning-problem">The Pinning Problem</h2> <p><strong>Pinning</strong> happens when a Virtual Thread gets stuck to its Carrier Thread and cannot be unmounted.</p> <p>The causes are:</p> <ul> <li>Executing inside a <code class="language-plaintext highlighter-rouge">synchronized</code> block (on Java 21 to 23; JDK 24 removed this case).</li> <li>Calling a native method (JNI), e.g. C/C++, assembly.</li> </ul> <h2 id="async-and-virtual-thread-which-one-is-better">Async and Virtual Thread, which one is better?</h2> <p>Each of them has cases where it is the better choice.</p> <h3 id="virtual-thread">Virtual Thread</h3> <ul> <li><strong>Readability</strong>: We can write “linear” code, the old-school style that everyone already knows how to write.</li> <li><strong>Stack traces</strong>: If an exception is thrown in async code, the stack trace often points to the internal async runner rather than our logic. The polluted stack trace makes debugging harder.</li> <li><strong>Compatibility</strong>: We can use existing libraries even if they were not designed for async. We mostly don’t need to change old code.</li> </ul> <h3 id="async">Async</h3> <ul> <li><strong>Memory control</strong>: If we want to build an ultra-high-performance system, async lets us fully control when memory is allocated and when tasks move.</li> <li><strong>Java &lt;21</strong>: If we must use a Java version before 21, we don’t have Virtual Threads, so the only choice is async.</li> <li><strong>Pinning problem</strong>: If our code often hits the pinning cases above, Virtual Threads lose their advantage.</li> </ul> <h2 id="conclusion">Conclusion</h2> <p>For most business applications, Virtual Thread is better. 
It makes coding easier and reduces code complexity, so it takes less effort for someone new to join the project.</p> <p>If you have any questions or suggestions, feel free to ask or share your opinion in the comment section.</p>]]></content><author><name></name></author><category term="software"/><category term="java"/><category term="concurrency"/><category term="system-design"/><category term="performance"/><category term="virtual-threads"/><summary type="html"><![CDATA[How Java Virtual Threads work under the hood, why they simplify concurrency, and when I choose them over async code.]]></summary></entry><entry><title type="html">Why Blocking I/O Hurts and How Asynchronous Fixes It</title><link href="https://www.dinhphu28.com/blog/2026/why-blocking-io-hurts-and-how-asynchronous-fixes-it/" rel="alternate" type="text/html" title="Why Blocking I/O Hurts and How Asynchronous Fixes It"/><published>2026-04-17T06:52:00+00:00</published><updated>2026-04-17T06:52:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/why-blocking-io-hurts-and-how-asynchronous-fixes-it</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/why-blocking-io-hurts-and-how-asynchronous-fixes-it/"><![CDATA[<h2 id="problem-of-blocking-io-way">Problem of Blocking I/O Way</h2> <p>We know that not all the work in our code is CPU-bound. Some of it is I/O-bound, such as network calls, file system access, database queries, etc.</p> <p>Almost all software applications process a lot of I/O-bound tasks. But if a thread is blocked while waiting for an I/O operation to complete, it cannot do anything else. This results in inefficient resource utilization and poor performance.</p> <p>The question is, why don’t we create a new thread for each I/O-bound task? The answer is that creating and managing threads can be expensive in terms of system resources. If we create too many threads, it can lead to thread contention, increased memory usage, and even system instability.</p> <p>E.g. the stack size of a Java OS thread is typically 1MB by default. 
If we create 1000 threads, they can consume up to about 1GB of memory just for the thread stacks. Okay, we can pay more memory to create more threads, but in reality, that barely solves the problem.</p> <p>And I think with the current prices of RAM, we don’t want to do that.</p> <p>Why does it cost so much? Because of “context switching”.</p> <p>We know that the CPU reads from the L1 and L2 caches (much faster than RAM) before reading from main memory. When a thread is switched out, the CPU needs to save the current state of the thread (including registers, program counter, etc.) and load the state of the next thread to be executed. This process is called context switching.</p> <p>If we do “context switching” all the time, we lose cache efficiency. When a thread is switched out, the CPU’s cache may be filled with data that is relevant to that thread. When the next thread is switched in, it may need to access different data that is not in the cache, leading to cache misses and increased latency. The CPU ends up spending more time on overhead than on the actual task.</p> <h2 id="asynchronous">Asynchronous</h2> <p>Because we cannot create a new thread for each I/O-bound task, we need a way to let the thread do other work while waiting for the I/O operation to complete. This is where asynchronous programming comes in.</p> <p>The core idea of async is decoupling what to execute from when to execute it.</p> <p>How does it work?</p> <p>Think of the lowest-level task, which just does one I/O operation. I call it an “atomic task”. Each higher-level task is composed of multiple atomic tasks, called a “task chain”. When an atomic task is waiting for its I/O operation to complete, it yields control so the thread can run another task.</p> <pre><code class="language-pseudocode">CLASS Task
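    PROPERTY next    // next task in the chain, or null for the last task

    // run() starts the I/O and returns immediately;
    // the callback is invoked later, when the result is ready.
    METHOD run()
        execute((result) -&gt; {
            IF NOT result.success THEN
                HandleError()
            ELSE IF next != null THEN
                next.run()
            END IF
        })
    END METHOD
END CLASS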


// usage
taskA.next &lt;- taskB
taskB.next &lt;- taskC

taskA.run()
</code></pre> <p>Ideally, no thread is ever blocked and all of them are always being run by the CPU, so the number of threads should equal the number of CPU cores. Right after this, I will show why, in Java, people usually use more threads than CPU cores.</p> <p>Some folks may ask: in the end, the atomic task still needs to wait for the I/O, so it still blocks the thread, right?</p> <p>No. Instead of letting the thread wait, the application hands the job off to the OS kernel using features like epoll or IOCP. When the I/O operation is complete, the hardware controller notifies the OS, which notifies the application so it can continue the tasks.</p> <p>But in some cases, such as JDBC drivers, async operations are not supported, so the thread is blocked while waiting for the I/O operation to complete.</p> <p>Right, we could think about R2DBC, but forget it for now; it’s a special case for database drivers.</p> <p>So we realize that if we have many JDBC tasks, no threads are left to do other tasks. We need to create more threads so that the OS can pause the blocked thread and swap in another one, which continues the task chain. When the JDBC call completes, the OS swaps its thread back in to continue.</p> <p>With this approach, we can minimize context switching while still doing tasks efficiently.</p> <h2 id="the-backpressure-problem">The backpressure problem</h2> <p>With async, we can have many tasks running within a fraction of a second. So we will accidentally DDoS our own database, right?</p> <p>No. The thread pool, which has a limited number of threads, acts as a safety valve that limits how many threads can query the database at once.</p> <p>But if we allow too many threads in the thread pool, the DB will crash. Another approach is a database connection pool, such as HikariCP. This matters especially when we use Virtual Threads, which allow us to create an almost unlimited number of threads.</p> <h2 id="downside">Downside</h2> <p>As developers, we must manually split the code into pieces. 
We must write the “start” part, then provide a callback for the “finish” part. The thread is freed because we explicitly end the function. So the code is harder to read and debug. I think everyone knows “callback hell”. Of course, other syntax makes this easier, like async/await in JavaScript or for/yield in Scala. But I think the “linear” coding style of synchronous code is easier to digest.</p> <p>In the next article, I will show you how Virtual Threads work and why we should use them.</p>]]></content><author><name></name></author><category term="software"/><category term="system-design"/><category term="concurrency"/><category term="performance"/><category term="async"/><category term="io"/><category term="java"/><summary type="html"><![CDATA[Blocking I/O wastes CPU cycles and limits scalability. This post explains how asynchronous programming solves the problem and improves performance.]]></summary></entry><entry><title type="html">Bloom Filter and How I Prevent Tons of Message Duplication</title><link href="https://www.dinhphu28.com/blog/2026/bloom-filter-and-application/" rel="alternate" type="text/html" title="Bloom Filter and How I Prevent Tons of Message Duplication"/><published>2026-04-16T10:57:00+00:00</published><updated>2026-04-16T10:57:00+00:00</updated><id>https://www.dinhphu28.com/blog/2026/bloom-filter-and-application</id><content type="html" xml:base="https://www.dinhphu28.com/blog/2026/bloom-filter-and-application/"><![CDATA[<h2 id="problem">Problem</h2> <p>Our Customer Experience Platform processes tens of millions of messages per day, and we need to prevent sending duplicate messages to end users within a recent time window.</p> <p>There are many ways to do this, such as:</p> <ul> <li>Database lookup</li> <li>Saving and looking up message keys in Redis</li> </ul> <p>The first option is a bad idea. With hundreds of millions of messages sent each week, every message would require a lookup to check whether it has already been sent. 
This takes a lot of time and puts high pressure on the database, even with proper indexing.</p> <p>The Redis approach is quite good if the dataset is small. However, it consumes a lot of memory. With limited RAM, this solution is not ideal.</p> <p>At that time, I thought we must have something better.</p> <p>That is <strong>Bloom Filter</strong> — just a data structure, but it can do this job efficiently with only MB-level memory.</p> <p>The idea is before sending each message, we check whether it has already been sent. If not, we send it and then add it to the Bloom Filter.</p> <p>The downside is that there can be false positives.</p> <h2 id="what-is-bloom-filter">What is Bloom Filter</h2> <p>A Bloom Filter is a space-efficient probabilistic data structure used to check whether an element is a member of a set.</p> <p>Why is it called <em>probabilistic</em>? Because there can be false positive results.</p> <p>In my case, the system checks whether the same message has been sent before. If the result is false, we can be 100% sure that it has never been sent, so we can safely send it without any database or Redis lookup.</p> <p>However, if the result is true, we cannot be completely sure that it has been sent. This is what we call a false positive.</p> <p>A Bloom Filter is a fixed-size bit array. 
We denote its size as <strong>m</strong>.</p> <div class="row"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/bloom-filter-1.svg-480.webp 480w,/assets/img/bloom-filter-1.svg-800.webp 800w,/assets/img/bloom-filter-1.svg-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/bloom-filter-1.svg" class="img-fluid rounded z-depth-1" width="100%" height="auto" title="quick lookup" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> Initialize Bloom Filter with size m=10 </div> <p>We need <strong>k</strong> independent hash functions to compute hash values for each input.</p> <p>When an element is added to the filter, we hash it using all <strong>k</strong> hash functions. Then we take modulo <strong>m</strong> to get indices in the bit array, and set those positions to 1.</p> <p>For example, suppose we want to add <code class="language-plaintext highlighter-rouge">"dinhphu28"</code> to a filter with size <strong>m</strong> = 10 (initialized with all zeros) and <strong>k</strong> = 3 hash functions:</p> \[\begin{aligned} h1(\texttt{"dinhphu28"}) \bmod 10 &amp;= 3 \\ h2(\texttt{"dinhphu28"}) \bmod 10 &amp;= 2 \\ h3(\texttt{"dinhphu28"}) \bmod 10 &amp;= 8 \end{aligned}\] <p>We set the bits at indices 3, 2, and 8 to 1.</p> <p>Now we have:</p> <div class="row"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/bloom-filter-2.svg-480.webp 480w,/assets/img/bloom-filter-2.svg-800.webp 800w,/assets/img/bloom-filter-2.svg-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/bloom-filter-2.svg" class="img-fluid rounded z-depth-1" width="100%" height="auto" title="quick lookup" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> Add "dinhphu28" 
</div> <p>Now, add <code class="language-plaintext highlighter-rouge">"jack"</code>:</p> \[\begin{aligned} h1(\texttt{"jack"}) \bmod 10 &amp;= 1 \\ h2(\texttt{"jack"}) \bmod 10 &amp;= 5 \\ h3(\texttt{"jack"}) \bmod 10 &amp;= 8 \end{aligned}\] <p>We set bits at indices 1, 5, and 8 to 1:</p> <div class="row"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/bloom-filter-3.svg-480.webp 480w,/assets/img/bloom-filter-3.svg-800.webp 800w,/assets/img/bloom-filter-3.svg-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/bloom-filter-3.svg" class="img-fluid rounded z-depth-1" width="100%" height="auto" title="quick lookup" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> Add "jack" </div> <p>To check whether <code class="language-plaintext highlighter-rouge">"dinhphu28"</code> exists in the filter, we compute the indices again. Instead of writing, we check whether all corresponding bits are set to 1.</p> <ul> <li>If all bits are 1 → the element <em>probably</em> exists</li> <li>If any bit is 0 → the element definitely does not exist</li> </ul> <p>So why do I say <strong>probably</strong>? 
Let’s look at this example.</p> <p>Check <code class="language-plaintext highlighter-rouge">"adam"</code>:</p> \[\begin{aligned} h1(\texttt{"adam"}) \bmod 10 &amp;= 5 \\ h2(\texttt{"adam"}) \bmod 10 &amp;= 1 \\ h3(\texttt{"adam"}) \bmod 10 &amp;= 2 \end{aligned}\] <p>All indices (5, 1, 2) are already set, even though we never added <code class="language-plaintext highlighter-rouge">"adam"</code> to the filter.</p> <div class="row"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/bloom-filter-4.svg-480.webp 480w,/assets/img/bloom-filter-4.svg-800.webp 800w,/assets/img/bloom-filter-4.svg-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/bloom-filter-4.svg" class="img-fluid rounded z-depth-1" width="100%" height="auto" title="quick lookup" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <div class="caption"> Check "adam" </div> <p>Because these bits were set by other elements, the filter returns that <code class="language-plaintext highlighter-rouge">"adam"</code> probably exists — but it does not. This is a false positive.</p> <h2 id="how-i-resolve-the-duplication-problem">How I Resolve the Duplication Problem</h2> <p>Each message has a unique key. In my case, it is a combination of the message template ID and the recipient ID.</p> <p>After sending a message, I add this key to the Bloom Filter. 
Before sending a message, I check whether the key exists in the filter.</p> <ul> <li>If the result is false → I am sure the message has never been sent → send it</li> <li>If the result is true → it might be a false positive → I drop it</li> </ul> <p>I choose to drop messages in case of uncertainty because the cost of sending duplicate messages is much higher than the cost of dropping a few valid ones.</p> <p>For dropped messages, I push them to Kafka for offline processing to rebuild the dataset and handle them later.</p> <p>With this approach, I can prevent tons of duplicate messages while using only MBs of memory for the Bloom Filter.</p> <p>Some people might ask: what about multiple service instances? How do we share the Bloom Filter?</p> <p>The answer is <strong>Redis Bloom</strong>, a Redis module that provides Bloom Filter data structures.</p> <p>With Redis Bloom, we can share the filter across multiple service instances.</p> <pre><code class="language-mermaid">---
config:
    theme: redux-color
    look: handDrawn
    markdownAutoWrap: false
---

flowchart TD
    A["Event Incoming"] --&gt; B{"Redis Bloom"}

    B -- Hit --&gt; F["Drop Event&lt;br/&gt;(Duplicate)"]
    B -- Miss --&gt; G["Process Event"]

    G --&gt; H["Persist Event Status"]
    H --&gt; I["Add to Redis Bloom"]

    I --&gt; K["End"]
    F --&gt; L

    %% Optional offline recovery
    L["Offline Rebuild Dataset&lt;br/&gt;(Handle False Positives)"]
    L --&gt; K
</code></pre>]]></content><author><name></name></author><category term="software"/><category term="system-design"/><category term="data-structure"/><summary type="html"><![CDATA[Bloom Filter is a probabilistic data structure with low memory cost, used to test whether an object is a member of a set.]]></summary></entry></feed>