
Fintech Engineering Handbook

w.pitula.me

Comments

The idempotency keys section alone is worth the read; most devs learn that lesson the hard way.
Also audit trails. A good audit trail can save the company (and you) in an emergency as well. It is useful for debugging and as a last-resort compliance data source.
I just build the audit trails and skip the non audit trails.

That way I can debug and have a last resort of compliance, but also save time by not building the first resort of compliance.

100%. It deserves more detail, too.

I've spent many hours explaining how idempotency is supposed to work, and why it's important. Most teams understand the need for it, but very few thought about it up front.

I just wish the financial industry itself had known about these when the core banking systems and financial communication protocols of the 60s and 70s were invented that are still being used to this day...

Many of these predate the widespread knowledge of idempotency, so often idempotency keys are hacked together by joining various, hopefully globally unique fields, except that they never quite are. (You can look behind the curtain sometimes, e.g. when your bank does not let you transfer the same amount to the same recipient account on the same calendar day.)
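To make the contrast with stitched-together keys concrete, here is a minimal sketch of explicit idempotency-key handling. It assumes an in-memory store and a hypothetical `TransferService`; none of these names come from the handbook, and a real system would persist the key in the same database transaction as the ledger entry.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """The same key was reused with a different request payload."""


class TransferService:
    # Hypothetical service, for illustration only.
    def __init__(self):
        self._seen = {}    # idempotency key -> (payload fingerprint, result)
        self.executed = 0  # how many transfers actually ran

    @staticmethod
    def _fingerprint(payload: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def transfer(self, key: str, payload: dict) -> dict:
        fingerprint = self._fingerprint(payload)
        if key in self._seen:
            stored_fingerprint, result = self._seen[key]
            if stored_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            return result  # replay: return the stored result, move no money
        self.executed += 1
        result = {"transfer_id": self.executed, "status": "accepted"}
        self._seen[key] = (fingerprint, result)
        return result
```

Because the client generates the key, two genuinely separate transfers of the same amount to the same recipient on the same day simply carry different keys, which a key derived from those fields cannot express.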

Sorry, I have to ask these days: is this carefully written-down information from years of experience in the field, or AI slop?
Appears that the author got some help organizing the document, but wrote it all themselves.
Whilst I wouldn't say anything in it requires years of experience to know, this would be helpful for someone who hasn't considered anything about monetary systems. It doesn't read like slop; I could be wrong, but even so it all seems fairly reasonable (I fully read only about 50% before realising there's nothing new here for me, and then skimmed the rest).
Skimmed it and based on my experience in fintech, it looks good, accurately represents the real world. I guess there’s still a chance it is AI generated but it doesn’t seem like vacuous slop, it has substance!
Hey, author here :)

It's at least 80% organic artisanal writing and maybe 20% AI when I needed help with grammar, completeness, broader perspective and everything around.

It may be a good idea to start the book with a really short "About the author" to state exactly this and your work experience. Otherwise looks well written to me, good job! :)
Native English speaker. I scanned it and IMO there's a slight overuse/misuse of hyphens. Maybe the AI tool could be asked to identify and correct? (The hyphens might be triggering people to think it's AI, too).

They mostly need replacing with a full stop or a colon.

E.g.

"In practice this means storing the amount as an integer in its smallest unit - €12.34 becomes 1234"

->

"In practice, this means storing the amount as an integer in its smallest unit: €12.34 becomes 1234."

or

"In practice, this means storing the amount as an integer in its smallest unit (e.g., €12.34 becomes 1234)"

Hyphens here are meant to be formatting. You would be correct if this was a literary piece, but handbooks and sheets don't need to use these rules.
It's a misuse because in this context the hyphen could be mis-construed as a negative amount, and this causes the reader to stop and carefully re-read the content to ensure that they're not misinterpreting it. That's not where you want to be spending your reader's attention.
Spelling, grammar and punctuation matter everywhere, in my opinion.

I wouldn't have read your comment if it were all lowercase and used zero punctuation, for example.

I worked for a few years in consumer banking, and this looks like solid advice.
Word of advice to anyone considering the "minor-units precision" strategy for representing monetary amounts: Don't (or at least, don't use it as an interchange/API data format).

It seems like a clever idea (fast integer math, no rounding problems for addition and subtraction), but it'll bite you incredibly hard if you ever stumble upon an edge case such as working with a partner that has a different implied number of digits for a given currency. This is especially relevant for stablecoins, which often have a different number of implied decimal digits than the "fiat" currency they represent.

Also, consider representing amounts as a string type in JSON-based APIs. JSON does not specify decimal precision, so you (and all your users/vendors) will always have to make sure your parser/serializer doesn't internally lose precision by going via floating point. This can get ugly fast, and while a string seems conceptually less neat, it completely bypasses that problem. (Some will call this an anti-pattern [1], but I'd rather not fight this particular battle for ideological purity on the shoulders of my users or shareholders.)

[1] https://blog.json-everything.net/posts/numbers-are-numbers-n...
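A small sketch of the precision problem described above, using only Python's standard `json` and `decimal` modules. The field names and the `dump_amount` helper are illustrative assumptions, not part of any particular API.

```python
import json
from decimal import Decimal

# Amount sent as a JSON number: the default parser goes through binary float.
as_number = json.loads('{"amount": 0.1}')["amount"]
total_float = as_number + as_number + as_number  # 0.30000000000000004

# Amount sent as a JSON string: the JSON parser never interprets it.
as_string = json.loads('{"amount": "0.1"}')["amount"]
total_decimal = Decimal(as_string) * 3  # Decimal('0.3')

# If a producer insists on JSON numbers, Python's json module can build
# Decimals directly from the number's text instead of going via float.
safe_number = json.loads('{"amount": 0.1}', parse_float=Decimal)["amount"]


def dump_amount(amount: Decimal, currency: str) -> str:
    """Serialize with the amount as a string (field names are illustrative)."""
    return json.dumps({"amount": str(amount), "currency": currency})
```

The `parse_float` hook only helps on your own side; the string representation protects every consumer regardless of which parser they use.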

What do you recommend instead? Standard floating-point ("float"/"double"), fixed-point arithmetic with thousandths (or smaller) of the minor unit, arbitrary-precision decimal numbers, or something else entirely?
I think what matters most is your database and API representation, as well as having consistent and well-defined rounding rules.

I largely agree with TFA: Round explicitly and consistently whenever you cross a boundary, i.e. database persistence and internal API calls.

Use whatever works for your required business case internally (i.e. inside of procedures calculating some function of one or more input amounts). This can be regular old floats/doubles if you absolutely know what you're doing, or BigDecimal if you aren't and would rather suffer slightly slower performance than having to talk to an auditor about IEEE 754 rounding modes, or even minor-amount integers (yes, even though I just said to not use them – but you'll want to ABSOLUTELY NEVER leak them outside of your system, including your data/analytics pipeline, which might have different ideas about financial amounts than your business logic implementing a nice custom monetary type).
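As a rough illustration of "round explicitly whenever you cross a boundary", the sketch below keeps full precision inside a calculation and rounds once, in one named place, before persisting. The two-decimal quantum, the banker's rounding, and the function names are assumptions chosen for the example; the right policy depends on currency, contract, and jurisdiction.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Illustrative policy: two decimal places, banker's rounding.
PERSISTED_QUANTUM = Decimal("0.01")
ROUNDING = ROUND_HALF_EVEN


def fee(amount: Decimal, rate: Decimal) -> Decimal:
    """Internal calculation: keep full precision, do not round here."""
    return amount * rate


def to_boundary(amount: Decimal) -> Decimal:
    """Call this once, right before persisting or sending the amount."""
    return amount.quantize(PERSISTED_QUANTUM, rounding=ROUNDING)


raw = fee(Decimal("19.99"), Decimal("0.0275"))  # full-precision intermediate
stored = to_boundary(raw)                       # what crosses the boundary
```

Keeping the rounding mode in a single constant makes it something you can point an auditor at, rather than an accident of whichever type each call site happened to use.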

A string type. As parent says: it completely bypasses the problem. Save the numbers between double quotes and be done with it.
Storing numbers as arrays of u8? That doesn't make sense
For JSON serialization, which doesn't support fixed-point precision, it does.

Floating-point precision has too many gotchas to be suitable for storing Decimal types, especially for the Currency use case.

Surely it does:

  {
    "price": {
      "amount": 1000,
      "decimal_places": 2,
      "currency": "USD"
    }
  }
How is that better than {"amount": "10.00"} (which also bypasses all potential floating point parsing issues that your or your counterparty’s JSON library might have)?
It is explicit about the fact that that number of decimal places is part of the data.

The semantics for your string “10.00” are complex - is it considered equal to “10”? To “10.000”? To “10.001”?

A user interacting with an API that uses such a string might make all sorts of assumptions about what it supports.

A user interacting with an API that has an explicit decimal places concept is being told ‘decimals matter! They can vary! Here be dragons!’

> The semantics for your string “10.00” are complex - is it considered equal to “10”?

Yes, but "10 USD" would be a non-canonical representation and you probably serialized incorrectly.

> To “10.000”?

Yes, but same caveat as above applies.

> To “10.001”?

Obviously not, and any system you'd ever want to use in a financial context will tell you so.
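The answers above match how a decimal type typically behaves. A short sketch with Python's `decimal` module; the `is_canonical` helper is a hypothetical check, included to show that a strict API can still reject non-canonical spellings of an equal value.

```python
from decimal import Decimal

# Numeric comparison ignores trailing zeros...
same_value = Decimal("10.00") == Decimal("10") == Decimal("10.000")
different = Decimal("10.00") == Decimal("10.001")


# ...but the scale survives parsing, so an API can still insist on the
# canonical number of decimal places for a given currency.
def is_canonical(text: str, decimal_places: int) -> bool:
    exponent = Decimal(text).as_tuple().exponent
    return exponent == -decimal_places
```

Here `same_value` is `True` and `different` is `False`, while `is_canonical("10", 2)` is `False` even though the value equals 10.00.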

String and two-field exponent/mantissa representations are mostly the same in terms of semantics, yes. Making it two separate fields makes it less likely it would be put into `parseFloat`, but after doing some research I think strings are more popular in JSON [1, 2], so probably I’d stick to that as well.

[1]: https://msgspec.dev/supported-types#decimal

[2]: e.g. https://getlago.com/docs/api-reference/fees/fee-object#schem..., although they still use `amount_cents` for all currencies as the base rate

It makes a lot of sense if you value correctness over performance.
Why not store them in unary then?
Unary is exactly as expressive as decimal or binary for integers, but somewhat less efficient, so why would you?
idk, why would you store integers as ASCII strings? It's somewhat less efficient.
Because it's much more explicit. Computers are fast, engineering is expensive. You usually never want to optimize prematurely when dealing with monetary amounts.
Even more explicit would be a PNG of the dollar bills. I suggest that.
Except that now you have a new problem: Opinionated theorists that haven’t been part of a nasty “oh no, we accidentally considered some amounts as 10x/100x/1000x larger/smaller than expected” incident in their career yet…
Do not throw away any precision in finance/money computation, regardless of what you are doing or how.

In C#, for example, there is the decimal type for those computations.

You'll definitely have to throw it away at some point.

The art is in making those points well-defined and rare enough to not cause large discrepancies, but frequent enough to avoid ballooning arbitrary-precision numbers across databases and services that might not be able to handle them.

I really like that phrasing! Would you mind if I steal it in some form if I decide to review this part of the book?
Not at all, and thanks for writing all of this up!
Store the value multiplied by 10^8 as an integer. That gives you a huge integer, but it's extremely accurate, especially for USD-denominated amounts. It is easily transformed into floating point numbers for reporting/etc.
> but it'll bite you incredibly hard if you ever stumble upon an edge case such as working with a partner that has a different implied number of digits for a given currency

Why would that be a problem? You just transform the values when interacting with their API.

Sure, but are all your (and your users' and vendors') engineers and LLM agents going to remember that? When in doubt, always be explicit.
I'm curious how you handle that.

Let's say I operate with a 4-decimal expectation and your API expects 6: is there any way to reconcile that outside of documentation and/or metadata? (Which would be the same issue, I guess, whatever representation is used?)

Yeah, you need to document it.

Still, even if you do: Chances that your users are just going to assume you're conforming to ISO 4217, some national standard, or your competitor that they're already integrated with are pretty high, so I wouldn't take the chance. Pick something that doesn't have to be documented instead.

Exactly, the model is in integers and the representation can be 1⃣3⃣ or whatever; that's why model-view separation exists.
Sure, you can do that if you can absolutely guarantee that everyone will always respect that separation and there will never be ambiguity between your internal and some partner's representation – even during incidents, even during low-level CSV-to-DB ETLs during incidents ("just one time, I promise, we don't have time to build the proper adapter, but look how similar their and our formats are").
Customer was charged $0.995 after fees, how to represent in your data model with integer cents?
Round it up
Charge $0.995

Refund $1.00

Repeat

Charge $0.995

Charge Actual $1.00

Refund $1.000

Alternately

Charge $0.995

ERROR CHARGE AMOUNT MUST BE ROUNDED TO NEAREST CENT

In my scenario your payment gateway added a $0.005 fee. You told it $0.99.
You'll have to decide when and how to round. Keeping individual billing items at high precision and rounding after summing them up can work; defining and documenting a rounding policy (or complying with whatever's legally required in your jurisdiction/domain) and rounding each individual billed item can as well.
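The two policies mentioned here can give different totals for the same line items. A minimal sketch, assuming half-up rounding to cents and three items of $0.995 each (both are assumptions for illustration):

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_cents(amount: Decimal) -> Decimal:
    # ROUND_HALF_UP is an assumption; use whatever your policy mandates.
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


items = [Decimal("0.995")] * 3

# Policy A: round every billed item, then add them up.
rounded_each = sum((round_cents(item) for item in items), Decimal("0"))

# Policy B: add at full precision, round once at the end.
rounded_once = round_cents(sum(items, Decimal("0")))
```

Policy A bills 3.00 and policy B bills 2.99. Neither is wrong in itself; the problem only appears when two systems reconciling the same invoice each picked a different one.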
You use 1/1000th or 1/10000th or whatever you need. You do not need “cents”.
Currency: USD Amount: 99500 Decimals: 5
Congrats m8 you invented floating point
Then you have no idea how floating point works. (Hint: it's in the name.)
How is the client going to pay that? They can't be charged half a cent.
Because a lot of the time there won’t be any error when you’re wrong, just silent data loss.
I’ve seen bugs like this in prod systems. The notional value of the error tends to make the people concerned anything but silent.
Having done HFT / low-latency in C++ with a browser based (read: JavaScript) management front-end: Go ahead and use integer cents everyone. It’s practically an industry standard and it works just fine. Anything else is a worse compromise.
Agree with this, working from HFT to payments to account management in the past.

You can have the blockchain team be an expert in converting integer cents, or the forex team be an expert in sub-cent conversions. You don't want to require _every team_ to have expertise in float math, by default.

Big decimals are widely available and don’t require any expertise but avoid many of the footguns of implied decimal integers.
Imagine advising someone who explicitly said they work in HFT to use big decimals.
BigDecimal should be used by almost everyone except for HFT since they're really slow.
When performance isn't a concern, I largely agree! Not every financial system can use big decimals as its base, though. And HFT isn't the only place in the financial sector where this performance concern might pop up.
It is fine as long as you don’t cross any edge cases (crypto, or more recently stuff like AI token pricing) and don’t forget to account for third party quirks (e.g. Stripe’s zero-decimal currencies: https://docs.stripe.com/currencies#zero-decimal).
JPY not having any minor units is arguably not a “third party quirk” but just how the currency works. The same goes for various three decimal digit currencies.
I mean yeah, but it does make things more complicated. Currencies change to 0-decimal, but some systems still expect 2-decimal representation for backcompat reasons (like ISK and UGX in Stripe).
(To clarify some more, we’re in agreement here. My point was mainly that “just storing cents” seems like an easy solution and it might seem to be working well – until it’s not :-)
If you’re only trading in USD and other two-decimal currencies it can work fine, yes. For anything else, it’s much worse as also detailed in TFA.
You provide different handling strategies for different currencies. You also store currency data alongside your amount. There is nothing complicated or edge-case here.

This works for USD, JPY or $MEMECOIN and it scales very well.
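A sketch of what "currency data alongside the amount" could look like, assuming a hypothetical `Money` type and a hand-written table with a few ISO 4217 minor-unit exponents. A real system would load the full table and would also need to record per-partner deviations such as the ones discussed above.

```python
from dataclasses import dataclass
from decimal import Decimal

# Minor-unit exponents for a few ISO 4217 currencies (illustrative subset).
MINOR_UNITS = {"USD": 2, "EUR": 2, "JPY": 0, "KWD": 3}


@dataclass(frozen=True)
class Money:
    minor: int     # amount in the currency's smallest unit
    currency: str  # always travels with the amount

    def to_decimal(self) -> Decimal:
        # Shift the decimal point left by the currency's exponent.
        return Decimal(self.minor).scaleb(-MINOR_UNITS[self.currency])

    @classmethod
    def from_decimal(cls, amount: Decimal, currency: str) -> "Money":
        scaled = amount.scaleb(MINOR_UNITS[currency])
        if scaled != scaled.to_integral_value():
            raise ValueError(f"{amount} has too many decimals for {currency}")
        return cls(int(scaled), currency)
```

With this, `Money(500, "JPY")` is 500 yen while `Money(500, "USD")` is five dollars, and the integer is never interpreted without its currency.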

If someone sells you 12345.55 EUR vs USD at a rate of 1.12345, how many EUR do you think you end up with? Do you think all market participants even agree? What if the rate is 1.123456?

For added fun, you can introduce division. Some systems will allow you to sell 12345.55 USD to buy EUR at a rate of 1.12345.

The article’s “no lost data” tenet is not really viable when this sort of division is involved. Are you going to track your account balance as a rational number with an absolutely immense denominator forever?
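Working the numbers from this comment through Python's `decimal` module shows where the disagreement comes from. The two rounding conventions are assumptions standing in for two hypothetical counterparties, and the rate is read as USD per EUR.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

CENT = Decimal("0.01")
eur = Decimal("12345.55")
rate = Decimal("1.12345")  # read here as USD per EUR

exact_usd = eur * rate     # exact product, seven decimal places

# Two counterparties with different (assumed) rounding conventions.
usd_half_up = exact_usd.quantize(CENT, rounding=ROUND_HALF_UP)
usd_truncated = exact_usd.quantize(CENT, rounding=ROUND_DOWN)

# The inverse direction: sell USD, buy EUR. This quotient has no finite
# decimal expansion, so it is cut off at the context precision
# (28 significant digits by default) before any business rounding.
eur_bought = Decimal("12345.55") / rate
```

The exact product is 13869.6081475, which one party books as 13869.61 and the other as 13869.60. The division case cannot be stored exactly in any fixed number of decimals, so some rounding rule has to be agreed on.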

You store the sums on either end, the currencies, the exchange rate and the final sum? No one has .0000145 cents in their account. Rounding occurs in the real world.
> You store the sums on either end, the currencies, the exchange rate and the final sum?

There is a remarkable amount of disagreement as to whether one should do one’s back office work based on the price or based on the quantity of the counter currency.

> No one has .0000145 cents in their account. Rounding occurs in the real world.

Indeed. But you either need to convince all parties to agree to round the same way or you need to accept small errors.

My experience in consumer banking says that every instrument specifies the precision of the calculation, how and when rounding happens, and a slew of little details.

So, yes, everyone has to understand how all their partners are doing rounding and summing.

In certain areas of the institutional finance world, everyone seems to accept that everyone's math is allowed to differ by a few cents, and they tally up the errors and move on with their lives.

I would, however, be quite surprised if my personal bank account did this.

When the bank owes you money, they round down.

When you owe the bank money, they round up.

Unless you work for the bank, in which case your annual salary is divided by 12 and rounded up for a monthly figure.
Of course market participants agree?

Exchanges have calculation rules for every type of mark and payment and will always specify rounding.

The only one who has to “agree” is the exchange. And it should be covered by the TOS.
> If someone sells you 12345.55 EUR vs USD at a rate of 1.12345, how many EUR do you think you end up with?

1. If someone sells me 12345.55 EUR, I hope to end up with 12345.55 EUR.

2. That's the point though. They will sell you a certain amount of EUR at a certain dollar price. This, in turn, implies a rate (which might well have more than 5 digits behind the decimal). This is ideally close to the quoted rate, sure. But what counts is the actual EUR amount and the actual USD amount, not what rate was quoted or with how many digits.

The only real correct solution here is to send mantissa and exponent as two separate integers. It's trivial to convert between exponents for whatever math you want, it can be as correct as you want, and is unambiguous.

In the HFT space you save some wire space if you can commit to a consistent exponent for some {slice} up front (think instrument/tick-size/asset-class/exchange/feed/server/whatever/...) such that you only need to send the mantissa and your clients can have a hard coded exponent. However, in similar spaces it's often worth the extra uint32 to send an on-the-wire exponent such that things _can_ change and you aren't hamstrung later by earlier "we only need cents now!" design choices when, e.g., you suddenly need to support bitcoin/... prices to full precision. (Your users will thank you when they don't have to coordinate a breaking change when you want to adjust your fixed exponent.)
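A minimal sketch of the two-integer representation, assuming a hypothetical `WireAmount` type in which `exponent` means the number of decimal places (so mantissa 99500 with exponent 5 is 0.99500). Sign conventions for the exponent vary between protocols, so treat that choice as an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class WireAmount:
    mantissa: int  # e.g. 99500
    exponent: int  # number of decimal places, e.g. 5 -> 0.99500

    def rescale(self, exponent: int) -> "WireAmount":
        """Re-express at another exponent; refuse to drop nonzero digits."""
        shift = exponent - self.exponent
        if shift >= 0:
            return WireAmount(self.mantissa * 10 ** shift, exponent)
        quotient, remainder = divmod(self.mantissa, 10 ** (-shift))
        if remainder:
            raise ValueError("rescaling would lose precision")
        return WireAmount(quotient, exponent)

    def to_decimal(self) -> Decimal:
        return Decimal(self.mantissa).scaleb(-self.exponent)
```

Rescaling 0.99500 from five to three places is lossless and gives mantissa 995; rescaling it to two places raises instead of silently rounding, which forces the caller to choose a rounding rule explicitly.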

If you do that though aren't you just reinventing floating-point?
No, because you're doing decimal floating point, which eliminates the rounding errors of binary floating point.
No, standard floating point implementations have higher precision for smaller numbers than for larger ones. So for example, in a 32-bit float, there are far more numbers between 0 and 1 than there are between 1,000,000 and 1,000,001. With 32-bit floats, you start losing whole integers at relatively small numbers.

Integers have a consistent precision across the entire number line.
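The uneven spacing is easy to observe. The sketch below uses only the standard library (`math.ulp` needs Python 3.9 or later); the `as_float32` helper is a hypothetical name for a round-trip through 32-bit storage.

```python
import math
import struct


def as_float32(x: float) -> float:
    """Round-trip a Python float (64-bit) through 32-bit storage."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 2**24 + 1 is the first whole number a 32-bit float cannot hold.
lost = as_float32(16_777_217.0)  # comes back as 16777216.0

# The gap between adjacent 64-bit floats grows with magnitude.
gap_near_one = math.ulp(1.0)
gap_near_million = math.ulp(1_000_000.0)

# Integers keep a spacing of exactly 1 at every magnitude.
cents = 16_777_217 + 1
```

So a 32-bit float counting cents stops being exact a little above 167,772 dollars, while an integer count of cents stays exact.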

> The only real correct solution here is to send mantissa and exponent as two separate integers.

That’s essentially the same thing as a String-serialized big decimal, just less readable, no?

That’s quite a bit slower to process, at least if you’re converting to integers to do the calculations; and the calculations would be quite a bit slower if you kept the big decimal type.
True, but this is usually your least concern when you're dealing with monetary amounts/math.
They specifically mentioned HFT so I suspect they care a lot about processing speed